Byzantine Clock Synchronization




All published fault-tolerant clock synchronization protocols are shown to result from
refining a single paradigm. This allows the different clock synchronization protocols to be
compared and permits presentation of a single correctness analysis that holds for all. The
paradigm is based on a reliable time source that periodically causes events; detection of such
an event causes a processor to reset its clock. In a distributed system, the reliable time
source can be approximated by combining the values of processor clocks using a generalization
of a "fault-tolerant average", called a convergence function. The performance of a clock
synchronization protocol based on our paradigm can be quantified in terms of the two
parameters that characterize the behavior of the convergence function used: accuracy and
precision.
1. Introduction


Certain applications require that synchronized clocks be available to processors in a distributed
system. For example, the accuracy of performance statistics computed in terms of elapsed time
between events at different sites depends on how closely the clocks at participating sites are
synchronized. Also, timeouts and other time-based synchronization schemes (such as the state-machine
approach [Lamport 84]) involve delays that are proportional to how closely clocks at participating
sites are synchronized. And, real-time process control systems require that accurate timestamps be
assigned to sensor values so that these values can be correctly interpreted.

Other applications further require that clocks advance at approximately the same rate as real
time. To ensure that deadlines can be met in real-time process-control applications, tasks are usually
broken into small computations and scheduled based on the processor clock. If a clock synchronization
protocol suddenly sets that clock forward, thereby momentarily increasing its rate, the processor
might not be able to handle in a timely manner all the tasks that become due. Also, clocks are
sometimes used to assign timestamps to events so that it is possible to infer potential causality between
events. For example, creation times of files are usually taken to define the order in which those files
were created. A clock synchronization protocol that suddenly sets a clock back could destroy the
consistency of time with respect to potential causality.

Even if we could start all processor clocks at the same time, they probably would not remain
synchronized for long. Crystal clocks found in today's processors run at rates that differ by as much
as $10^{-6}$ seconds per second from real time and thus can drift apart by 1 second every 10 days; clocks
based on power-line frequency can drift considerably more than this: when used as a time base, the
power grid in the Northeastern United States typically drifts 4 to 6 seconds from real time over the
course of an evening [Mills 85]. Keeping clocks in a distributed system synchronized without
appealing to a single, centralized, time service requires that clock values be exchanged and clocks
periodically adjusted. If failures can result in faulty processors exhibiting arbitrary behavior, then the
protocol has the additional burden of tolerating erroneous and inconsistent clock values.

This paper gives a single paradigm and correctness proof that can be used to understand all
published¹ fault-tolerant protocols for keeping clocks in a distributed system synchronized despite faulty
processors that can exhibit arbitrary behavior. The paradigm allows us to identify the different
implementation choices made by each protocol in solving three subproblems it defines. This permits the

¹E.g., [Babaoglu & Drummond 87], [Cristian et al. 86], [Halpern et al. 84], [Kopetz & Ochsenreiter 87], [Lamport &
Melliar-Smith 84], [Lamport & Melliar-Smith 85], [Lundelius & Lynch 84], [Mahaney & Schneider 85], and [Srikanth &
Toueg 85].
various Byzantine clock synchronization protocols² to be compared and the contributions of each to
be isolated. Previously, clock synchronization algorithms were viewed in terms of three distinct
classes: those based on convergence, those based on agreement, and those based on diffusion or
flooding of messages. Our proof is interesting because it necessarily generalizes all of the correctness
proofs that have appeared for the individual clock synchronization algorithms. Also, it is the first
proof in which clocks are treated as advancing at discrete times ("ticks"). Previous proofs modeled
clocks as monotonically increasing functions from real time to clock time.

The remainder of the paper is organized as follows. Our clock synchronization paradigm is
described in section 2. Techniques for reading clocks across a computer-communications network
are described in section 3. In section 4, we discuss properties of convergence functions, the central
component of the paradigm, and give some examples of convergence functions. Section 5 discusses
how agreement protocols can be used in implementing a convergence function. Conclusions and
related work appear in section 6. Appendix 1 analyzes the performance of clock synchronization
protocols derived from our paradigm and derives bounds for various parameters of such a protocol;
appendix 2 contains a glossary of the notation used in the paper.

2.  A  Paradigm  for  Clock  Synchronization 

The hardware clock at a correct processor p can be viewed as implementing a function $c_p$. This
function maps a real time $t$ to a clock time $c_p(t)$, is non-decreasing in its argument, and is
characterized by positive constants $\mu$, $\rho$, and $\kappa$. Constant $\mu$ defines the range of initial values of the clock:

Hardware Initial Value: $0 \le c_p(0) \le \mu$. (2.1)

Constants $\kappa$ and $\rho$ restrict the rate that clock time increases as a function of real time. Physical clocks
are counters that increase by 1 in response to periodically generated events called ticks. In a physical
clock, the (real time) interval between ticks can vary and this can cause the clock value to advance at
a different rate than real time. For our purposes, it is more convenient to model a hardware clock as
having a fixed (real time) interval $\kappa$ between ticks, but advancing by a varying real number value $v$,
where $(1-\rho)\kappa \le v \le (1+\rho)\kappa$, in response to each tick. That is, we require

Hardware Rate: $0 < 1-\rho \le \frac{c_p(t+\kappa) - c_p(t)}{\kappa} \le 1+\rho$ for $0 \le t$, (2.2)

where $\kappa$ is called the tick width and $\rho$ the drift rate of the clock. Notice that (2.2) does not require the
value of $c_p$ to remain fixed between successive ticks although for most clocks this will be the case;


²In the distributed computing literature, arbitrary behavior in response to a failure is called Byzantine behavior. Clock
synchronization protocols that can tolerate such failures are called Byzantine clock synchronization protocols.


Hardware Rate (2.2) merely ensures that the rate at which the clock advances is within $\rho$ of the rate at
which real time passes.
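
The tick model behind Hardware Rate (2.2) can be illustrated with a short simulation. The following Python sketch is not part of the protocol: the function names, the uniform choice of per-tick increment, and the numeric tolerance are assumptions made for illustration only.

```python
import random


def hardware_clock_ticks(num_ticks, kappa, rho, initial, seed=0):
    """Return the modeled clock value at real times 0, kappa, 2*kappa, ...

    Each tick advances the clock by some v with
    (1 - rho) * kappa <= v <= (1 + rho) * kappa.
    Drawing v uniformly is an assumption; the model only bounds v.
    """
    rng = random.Random(seed)
    values = [initial]
    for _ in range(num_ticks):
        v = rng.uniform((1 - rho) * kappa, (1 + rho) * kappa)
        values.append(values[-1] + v)
    return values


def satisfies_hardware_rate(values, kappa, rho, tol=1e-9):
    """Check the Hardware Rate bound at every tick boundary."""
    return all(
        1 - rho - tol <= (later - earlier) / kappa <= 1 + rho + tol
        for earlier, later in zip(values, values[1:])
    )


if __name__ == "__main__":
    ticks = hardware_clock_ticks(num_ticks=1000, kappa=0.001, rho=1e-6, initial=0.0)
    print(satisfies_hardware_rate(ticks, kappa=0.001, rho=1e-6))
```

The check only samples the clock at tick boundaries, which is where the model says the value changes.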

We  make  no  assumptions  about  the  behavior  of  clocks  at  faulty  processors — not  even  that  they 
can  be  modeled  by  functions.  A  clock  at  a  faulty  processor  need  not  increase  as  real  time  passes  and 
might  give  inaccurate  or  conflicting  information  when  it  is  read. 

A clock synchronization protocol implements a virtual clock $\hat{c}_p$ at each processor p. A virtual
clock, like a hardware clock, is a function that maps real time $t$ to clock time $\hat{c}_p(t)$, is non-decreasing
in its argument, and is characterized by positive constants $\hat{\delta}$, $\hat{\rho}$, and $\hat{\kappa}$ such that for correct processors
p and q

Virtual Synchronization: $|\hat{c}_p(t) - \hat{c}_q(t)| \le \hat{\delta}$ for $0 \le t$, (2.3)

Virtual Rate: $0 < 1-\hat{\rho} \le \frac{\hat{c}_p(t+\hat{\kappa}) - \hat{c}_p(t)}{\hat{\kappa}} \le 1+\hat{\rho}$ for $0 \le t$. (2.4)

$\hat{\delta}$ characterizes how closely virtual clocks are synchronized with each other; $\hat{\rho}$ is the drift rate of
virtual clocks; and $\hat{\kappa}$ specifies the (real-time) interval between virtual clock ticks.

If  a  reliable  time  source  is  available,  then  satisfying  (2.3)  and  (2.4)  is  simple.  The  reliable  time 
source  periodically  distributes  the  correct  time  to  all  processors  and,  upon  receipt  of  this  correct  time, 
a  processor  adjusts  its  virtual  clock  accordingly.  Provided  the  time  is  distributed  frequently  enough, 
processor  clocks  will  not  drift  too  far  apart  in  the  interval  between  adjustments,  so  (2.3)  will  be  main¬ 
tained.  And,  provided  no  processor  has  to  adjust  its  clock  by  too  much,  the  adjustment  can  be  spread 
over  the  interval  that  precedes  the  next  resynchronization  and  (2.4)  will  be  maintained.  We  have  only 
to  implement  the  reliable  time  source. 

The reliable time source serves two functions in the clock synchronization protocol just
outlined. First, it periodically generates an event that when detected by a correct processor causes that
processor to resynchronize its clock. This can be formalized in terms of constants $r_{min}$, $r_{max}$, and $\beta$
as:

RTS1: A reliable time source generates a sequence of events at real times $t^1_{RTS}$, $t^2_{RTS}$, ... such
that

$t^1_{RTS} = 0 \wedge (\forall i: 0 < i: r_{min} \le t^{i+1}_{RTS} - t^i_{RTS} \le r_{max})$

and the real time $t^i_p$ at which a processor p detects the event produced at $t^i_{RTS}$ satisfies
$t^1_p = 0 \wedge (\forall i: 1 \le i: 0 \le t^i_p - t^i_{RTS} \le \beta)$.
(Choosing 0 for these initial times models the fact that the protocol and clocks start at real time 0.)

Second,  in  addition  to  causing  events,  the  reliable  time  source  facilitates  clock  synchronization 
by  providing  a  value  to  each  correct  processor. 

RTS2: At $t^i_p$, processor p obtains a value $V^i_p$ that can be used in adjusting $\hat{c}_p$ to be consistent
with (2.3) and (2.4).

There are two things to note about RTS1 and RTS2. First, these properties do not imply that the
correct time is always available to processors, only that it is available periodically. Having a
reliable time source from which the correct time is always available results in a different clock
synchronization paradigm than the one just described.³ Second, RTS2 does not stipulate that the same value be
obtained by every processor, only that the value provided can be used to achieve (2.3) and (2.4).
This permits values obtained by different processors to be compensated for known delays due to
protocol execution and message delivery.

Although it is easy to satisfy RTS1 and RTS2 using a single clock, the resulting time source is
only as fault tolerant as that clock is. A reliable time source that does not depend on correct operation
of a single clock can be constructed by using approximately synchronized clocks in a distributed
system. RTS1 is achieved by using the individual processor clocks to signal the periodic
resynchronization events; RTS2 is achieved by having processors compute some type of fault-tolerant average of
the values of the clocks at processors in the system.

To describe the implementation of RTS1 and RTS2 in a distributed system, it will be
convenient to view resetting a virtual clock $\hat{c}_p$ as starting another virtual clock that runs concurrently
with the old one. Thus, initially (i.e. at real time 0) p uses virtual clock $\hat{c}^1_p$ and p starts a new virtual
clock $\hat{c}^{i+1}_p$ at real time $t^{i+1}_p$, when it detects the resynchronization event produced at time $t^{i+1}_{RTS}$. Using
this convention, the value of $\hat{c}_p(t)$ is characterized by

$t^i_p \le t < t^{i+1}_p \Rightarrow \hat{c}_p(t) = \hat{c}^i_p(t)$.

A virtual clock $\hat{c}^i_p$ is implemented at processor p using the hardware clock $c_p$ at processor p and
adding an adjustment value that is maintained by the clock synchronization protocol. Formally, this
is given by

$\hat{c}^i_p(t) \equiv c_p(t) + FIX^i_p(c_p(t))$ (2.5)


³While we have been able to design clock synchronization protocols based on this other paradigm, we have so far been
unable to develop a generic proof of correctness for them.


where $FIX^i_p(T)$ is a function from clock time at processor p to a correction for hardware clock $c_p$.⁴
Thus, p reads $\hat{c}_p$ at real time $t$ by reading $c_p$ and then adding the appropriate adjustment value based
on the current value of $FIX^i_p$.

So that a virtual clock does not violate Virtual Rate (2.4), the value of $FIX^i_p$ must change
gradually as a function of time. Therefore, $FIX^i_p(T)$ spreads any change in its correction to $c_p$ over
adjustment interval $AI$ clock seconds, a parameter of the protocol. The following definition of $FIX^i_p(T)$
achieves this. In it, $adj^{i-1}_p$ is the adjustment to $c_p$ necessary to implement $\hat{c}^{i-1}_p$ and $adj^i_p$ is the
adjustment to implement $\hat{c}^i_p$, so $adj^i_p - adj^{i-1}_p$ is the additional amount that $FIX^i_p$ must add to $c_p$ over the $AI$
clock-second interval starting from $t^i_p$ in order to approach $\hat{c}^i_p$:


$FIX^i_p(T) \equiv adj^{i-1}_p + \frac{(adj^i_p - adj^{i-1}_p)\,\min(T - c_p(t^i_p),\ AI)}{AI}$


The key to this definition is that $0 \le \min(T - c_p(t^i_p),\ AI) \le AI$, so that the change from $adj^{i-1}_p$ to $adj^i_p$ is
gradually spread out over $AI$ clock seconds.
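
The amortized correction can be written as a few lines of Python. This is a minimal sketch of the definition above; the function and parameter names (`fix`, `clock_at_resync`, and so on) are hypothetical, and hardware clock times are represented as floats for simplicity.

```python
def fix(T, adj_prev, adj_new, clock_at_resync, AI):
    """Correction added to the hardware clock when it reads T.

    adj_prev        -- adjustment in force before the resynchronization
    adj_new         -- adjustment the new virtual clock should reach
    clock_at_resync -- hardware clock reading when resynchronization began
    AI              -- adjustment interval, in clock seconds (must be > 0)
    """
    elapsed = min(T - clock_at_resync, AI)  # capped so the change stops after AI
    return adj_prev + (adj_new - adj_prev) * elapsed / AI


def virtual_clock(T, adj_prev, adj_new, clock_at_resync, AI):
    """Virtual clock value: hardware reading plus the current correction."""
    return T + fix(T, adj_prev, adj_new, clock_at_resync, AI)


if __name__ == "__main__":
    # Spread a 0.5 clock-second change over AI = 10 clock seconds.
    for T in (100.0, 105.0, 110.0, 200.0):
        print(T, fix(T, adj_prev=0.0, adj_new=0.5, clock_at_resync=100.0, AI=10.0))
```

Halfway through the interval half of the change has been applied, and after $AI$ clock seconds the correction stays at the new adjustment.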


For $FIX^i_p$ to work, $AI$ must be long enough to avoid violating the drift rate bounds for $\hat{c}_p$ in
Virtual Rate (2.4); however, it must not be too long or else Virtual Synchronization (2.3) could be
violated or the next superscripted clock might be started before the full adjustment has been
completed, leaving an even larger adjustment to be performed. The case where $AI \le \kappa$ is called
instantaneous resynchronization; otherwise continuous resynchronization occurs. Appendix 1 characterizes
values for $AI$ and other parameters of our paradigm that ensure Virtual Synchronization (2.3) and
Virtual Rate (2.4) hold.


To implement RTS2, we use a function CF that essentially averages the values of the
approximately synchronized clocks at correct processors in the system. In a system of N processors, $V^{i+1}_p$ of
RTS2 is defined by⁵

$V^{i+1}_p = CF(p,\ \hat{c}^i_1(t^{i+1}_p),\ \ldots,\ \hat{c}^i_N(t^{i+1}_p))$,

where CF is called a convergence function because it brings clocks closer together. Given this
definition of $V^{i+1}_p$,

$adj^{i+1}_p = CF(p,\ \hat{c}^i_1(t^{i+1}_p),\ \ldots,\ \hat{c}^i_N(t^{i+1}_p)) - c_p(t^{i+1}_p)$ (2.6)

is the amount that $\hat{c}^{i+1}_p(t^{i+1}_p)$ differs from $c_p(t^{i+1}_p)$ and we can now give the clock synchronization


⁴In this paper, clock times are denoted by upper-case letters and real times by lower-case letters.

⁵Evaluating $CF(p,\ \hat{c}^i_1(t^{i+1}_p),\ \ldots,\ \hat{c}^i_N(t^{i+1}_p))$ seemingly requires that p be able to read at the same instant all the virtual
clocks maintained by other processors. Section 3 explains how to get around this problem.


protocol for a processor p in a distributed system consisting of N processors. It appears in Figure
2.1.⁶ Three important things about that protocol are unspecified. They are

• the implementation of "detect event generated at time $t^{i+1}_{RTS}$",

• how one processor reads the virtual clocks at other processors, and

• convergence function CF.

Different choices for these result in different clock synchronization protocols. In fact, the various
choices permit viewing in terms of our paradigm all the published clock synchronization protocols
that do not make use of an external time source.

Much of this paper is, therefore, devoted to different implementation choices for the unspecified
aspects of Figure 2.1. The remainder of this section discusses different implementations of "detect
event generated at time $t^{i+1}_{RTS}$". Section 3 discusses ways that one processor can read the
virtual clocks at other processors. And, sections 4 and 5 give properties and examples of convergence
functions.


i := 1;
$adj^0_p$ := 0; $adj^1_p$ := 0;
do forever
    detect event generated at time $t^{i+1}_{RTS}$;
    $t^{i+1}_p$ := real time now;
    $adj^{i+1}_p$ := $CF(p,\ \hat{c}^i_1(t^{i+1}_p),\ \ldots,\ \hat{c}^i_N(t^{i+1}_p)) - c_p(t^{i+1}_p)$;
    i := i+1
od

Figure  2.1.  Clock  synchronization  protocol 
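
Once the clock readings are in hand, one iteration of the loop in Figure 2.1 reduces to a single subtraction. The Python sketch below shows that step in isolation. The names `resync_step` and `median_cf` are hypothetical, and the median is used only as a stand-in for CF; it is not one of the convergence functions analyzed in this paper.

```python
import statistics


def median_cf(p, readings):
    """Stand-in for CF (an assumption for this sketch): the median reading."""
    return statistics.median(readings)


def resync_step(p, virtual_readings, hardware_reading, cf):
    """Compute the new adjustment for processor p.

    virtual_readings -- values of all N virtual clocks at the detection time
    hardware_reading -- p's own hardware clock value at the same time
    cf               -- function (p, readings) -> combined time value
    """
    return cf(p, virtual_readings) - hardware_reading


if __name__ == "__main__":
    # Three processors; processor 0's hardware clock reads 9.9.
    print(resync_step(0, [10.0, 10.2, 10.4], 9.9, median_cf))
```

Passing CF as a parameter mirrors the way the paradigm leaves the convergence function unspecified.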


⁶Here and throughout, we use the notational device "t := real time now" as a way to talk about the value of a clock
during execution. Variable t is not actually implemented and is not directly accessible to the program, although $c_p(t)$ is.


Detecting  Resynchronization  Events 

An obvious approach to implementing "detect event generated at time $t^{i+1}_{RTS}$" uses the
approximately synchronized virtual clocks. For some predefined value R, each processor p waits until $\hat{c}^i_p$
reads $iR$ before starting $\hat{c}^{i+1}_p$. Either an interval timer or busy-waiting can be employed to implement
this waiting.

In this scheme, $t^{i+1}_{RTS}$ is the earliest real time that some correct processor's virtual clock has value
$iR$. Since virtual clocks at correct processors can advance as quickly as $1+\hat{\rho}$ clock seconds per real
second, $r_{min} = R/(1+\hat{\rho})$; and since they can advance as slowly as $1-\hat{\rho}$ clock seconds per real second,
$r_{max} = R/(1-\hat{\rho})$. To compute $\beta$, note that at the time the fastest correct clock reads $iR$, due to (2.3) the
slowest correct clock must read at least $iR - \hat{\delta}$. Thus, this (slow) correct clock might take as long as
$\hat{\delta}/(1-\hat{\rho})$ real seconds until it reaches $iR$; so, $\beta = \hat{\delta}/(1-\hat{\rho})$.
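
For concreteness, the three constants of RTS1 under this timer-based scheme can be computed directly from R and the virtual clock parameters. The following Python sketch restates the formulas in the preceding paragraph; the function name and the sample numbers are assumptions chosen for illustration.

```python
def resync_parameters(R, rho_hat, delta_hat):
    """Return (r_min, r_max, beta) for the timer-based detection scheme.

    R         -- resynchronization period, in clock seconds
    rho_hat   -- drift rate of virtual clocks (0 <= rho_hat < 1)
    delta_hat -- bound on the difference between correct virtual clocks
    """
    r_min = R / (1 + rho_hat)          # fastest clock reaches the next multiple of R soonest
    r_max = R / (1 - rho_hat)          # slowest clock takes the longest
    beta = delta_hat / (1 - rho_hat)   # slow clock must still cover delta_hat
    return r_min, r_max, beta


if __name__ == "__main__":
    # Illustrative values only: resynchronize every 60 clock seconds.
    print(resync_parameters(R=60.0, rho_hat=1e-5, delta_hat=0.01))
```

For small drift rates the interval between events stays close to R in real time, while $\beta$ is dominated by how tightly the virtual clocks are already synchronized.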

Another implementation of "detect event generated at time $t^{i+1}_{RTS}$" is for each processor to
broadcast a message when its virtual clock reaches some predefined value and to resynchronize when such
a message has been received from a correct processor. Here, $\beta$ is bounded by the variance in the
(real-time) delay of performing the broadcast. The details of this scheme, which is based on a simple
form of agreement, are given in section 5.

3.  Reading  Clocks  from  Afar 

Processors have access to clock time, not real time. This means that in order for a processor p
to obtain the arguments to CF needed to compute $adj^{i+1}_p$ (see (2.6)), p must obtain $c_p(t^{i+1}_p)$, $\hat{c}^i_1(t^{i+1}_p)$,
..., $\hat{c}^i_N(t^{i+1}_p)$, which requires that it read N clocks simultaneously. This is impossible for two reasons.
First, without special hardware a processor can read only one clock at a time. Second, in a distributed
system, processors do not necessarily have access to each other's clocks.

One solution to both of these problems is for each processor locally to implement
approximations of the virtual clocks at other processors. Processor p maintains a collection of tables $\tau^i_p[1..N]$
that can be used to compute an approximation for $\hat{c}^i_q(t)$, and p approximates $\hat{c}^i_q(t)$ at real time $t$ by
$c_p(t) + \tau^i_p[q]$. Thus, p can approximate $\hat{c}^i_1(t^{i+1}_p)$, ..., $\hat{c}^i_N(t^{i+1}_p)$ simply by reading $c_p(t^{i+1}_p)$ once and
using it and $\tau^i_p$ to compute the N values needed.

In one technique to construct $\tau^i_p$, first described in [Lamport & Melliar-Smith 84], processor p
periodically communicates with the other processors in the system. Suppose the minimum and
maximum delays (according to the clock at any correct processor) incurred in sending a message from one
correct processor to another, receiving it, and processing it, are $\Gamma_{min}$ and $\Gamma_{max}$. A processor p can
compute $\tau^i_p[q]$ by executing
send "$i$th clock time?" to q;
receive C from q timeout after $2\Gamma_{max}$;
if timed-out then C := …;
t := real time now;
$\tau^i_p[q]$ := $C + \Gamma_{min} - c_p(t)$

Processor q responds to an "$i$th clock time?" request from p by sending back $\hat{c}^i_q(t')$, where $t'$ is the
real time the reply is sent.

Define clock reading error $\lambda^i_p(q)$ to be the error in p's approximation of $\hat{c}^i_q$. Let $\Lambda$ be the
maximum clock reading error for any pair of correct processors. That is,

$(\forall p, q, i:\ |\hat{c}^i_q(t) - c_p(t) - \tau^i_p[q]| \le \lambda^i_p(q) \le \Lambda)$.

In order to bound $\Lambda$, first note that p's approximation of q's clock can drift away from q's clock by at
most $\rho+\hat{\rho}$ clock seconds per real second because the rate error of $\hat{c}^i_q$ is bounded by $\hat{\rho}$ and the rate
error of $c_p$ is bounded by $\rho$. Initially, $\tau^i_p[q]$ is in error by at most $\Gamma_{max} - \Gamma_{min}$ since only $\Gamma_{min}$ of the
message delay incurred by q's response to p's time request is accounted for in the calculation of
$\tau^i_p[q]$. Thus, at (real) time $t$, $\lambda^i_p(q)$ satisfies

$\lambda^i_p(q) \le \Gamma_{max} - \Gamma_{min} + (\rho+\hat{\rho})(t - lread_p(q)) \le \Lambda$ (3.1)

where $lread_p(q)$ is the real time that p last executed an assignment to $\tau^i_p[q]$ in the clock reading
protocol above. Although $\lambda^i_p(q)$ is a function of $t$, an upper bound on $t - lread_p(q)$ is usually known, and
therefore $\Lambda$ can be treated as a constant.
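
Bound (3.1) is simple enough to evaluate numerically, which makes the relative size of its two terms easy to see. The Python sketch below is illustrative only: the function name, parameter names, and sample delays are assumptions, not measurements from any system.

```python
def clock_reading_error_bound(gamma_min, gamma_max, rho, rho_hat, elapsed):
    """Upper bound on clock reading error, following (3.1).

    gamma_min, gamma_max -- minimum and maximum message delay
    rho, rho_hat         -- drift rates of the hardware and virtual clocks
    elapsed              -- real time since the table entry was last assigned
    """
    delay_uncertainty = gamma_max - gamma_min
    drift_since_read = (rho + rho_hat) * elapsed
    return delay_uncertainty + drift_since_read


if __name__ == "__main__":
    # Assumed numbers: delays between 1 ms and 5 ms, drift 1e-6, entry 10 s old.
    bound = clock_reading_error_bound(0.001, 0.005, 1e-6, 1e-6, 10.0)
    print(bound)
```

With these assumed numbers the delay uncertainty contributes 4 ms while drift contributes 0.02 ms, so the delay term dominates.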

Error $\lambda^i_p(q)$ can be kept small by recomputing $\tau^i_p[q]$ frequently, thereby keeping $t - lread_p(q)$
small. In practice, it suffices to obtain clock values from all processors just before computing $adj^{i+1}_p$,
because this minimizes the clock reading error just before the clock values are actually needed.
However, for reasonable intervals $t - lread_p(q)$, $(\rho+\hat{\rho})(t - lread_p(q)) \ll \Gamma_{max} - \Gamma_{min}$, so minimizing the
uncertainty in the network delay is the key to reducing clock reading error. Uncertainty in network delay can be
reduced by installing the clock reading protocol in the lowest level of the operating system. This is
because a large part of the uncertainty in network delay can be attributed to uncertainty in program
execution time due to interrupts and other forms of multiprogramming. The time it takes a message
to traverse a wire connecting computers does not have a high variance. Even when messages are
routed through intermediate sites, delays due to queuing in sites doing relaying can be measured and
recorded in the message and therefore can be accounted for.

A variation on the clock reading scheme just given, used in the clock synchronization protocols
of [Babaoglu & Drummond 87], [Cristian et al. 86], [Halpern et al. 84], [Lundelius & Lynch 84], and
[Srikanth & Toueg 85], reduces the number of messages by half but can increase clock reading error.
Instead of requesting the time, each processor q periodically broadcasts its virtual clock value
(including superscript $i$). Upon receipt of such a message, the receiver p updates $\tau^i_p[q]$ as follows.

receive C from q;
t := real time now;
$\tau^i_p[q]$ := $C + \Gamma_{min} - c_p(t)$

The reduction in number of messages sent using this scheme is due to lack of explicit request
messages: the passage of time, rather than an explicit request message, causes transmission of a
clock value. However, in a point-to-point network, clock reading errors can increase when this
scheme is used. This increase is because a processor p does not necessarily know what
communications line it should monitor for the next clock message it will receive. Polling communications
lines, even when done by processor microcode, increases $\Gamma_{max}$, since it is possible for a message to
remain queued at the receiver for an entire polling cycle. Since polling does not increase $\Gamma_{min}$, the
effect is to increase $\Gamma_{max} - \Gamma_{min}$, which, according to (3.1), increases $\lambda^i_p(q)$. Local area networks,
which usually have a single connection between the processor and network, do not have this problem.

4.  Convergence  Functions 

A convergence function CF for use in a system of N processors is a function of N+1 arguments
that satisfies certain properties. The first argument identifies the processor evaluating CF; each of the
following arguments $x_q$, $1 \le q \le N$, is a value from processor q. The properties required of
convergence functions are given below. These properties are used in the proofs of Virtual Synchronization
(2.3) and Virtual Rate (2.4) given in Appendix 1. Thus, this abstract characterization of convergence
functions is what permits the single set of proofs of Appendix 1 to apply to a collection of clock
synchronization protocols.

The first property required for a function CF to be a convergence function is that it be
monotonically non-decreasing in its last N arguments.

Monotonicity: If $(\forall i: 1 \le i \le N: x_i \le y_i)$ then $CF(p, x_1, \ldots, x_N) \le CF(p, y_1, \ldots, y_N)$.

When CF is used for clock synchronization, arguments $x_1$ through $x_N$ are time values, and this
property states that the value of the Reliable Time Source does not decrease as time passes.

The next property asserts that the relative magnitudes of the virtual clock values (and not their
absolute values) matter when they are combined to produce the value provided to p for RTS2. Thus,
CF satisfies

Translation Invariance: $CF(p, x_1+v, \ldots, x_N+v) = CF(p, x_1, \ldots, x_N) + v$ for $0 \le v$.

This property allows values of CF computed by different processors at different times to be
compared. If in the evaluation of CF by one processor, the values of arguments $x_1$ through $x_N$ are
shifted by the same amount (reflecting the passage of time) from the values used by the other, then
the result computed by the first will be shifted by that amount from the result computed by the
second.
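
Monotonicity and Translation Invariance are easy to spot-check for a candidate function on sample inputs. The Python sketch below does this; such checks can refute a candidate but cannot prove the properties. All names are hypothetical, the median is only an illustrative candidate, and `squared_max` is a deliberately bad one.

```python
import statistics


def median_cf(p, xs):
    """Illustrative candidate (an assumption): the median of the arguments."""
    return statistics.median(xs)


def squared_max(p, xs):
    """Deliberately bad candidate: squaring breaks translation invariance."""
    return max(xs) ** 2


def is_translation_invariant(cf, p, xs, v, tol=1e-9):
    """Check CF(p, x_1+v, ..., x_N+v) == CF(p, x_1, ..., x_N) + v on one input."""
    shifted = [x + v for x in xs]
    return abs(cf(p, shifted) - (cf(p, xs) + v)) <= tol


def is_monotone_on(cf, p, xs, ys):
    """Check CF(p, xs) <= CF(p, ys) for one pair with xs[i] <= ys[i] for all i."""
    if not all(x <= y for x, y in zip(xs, ys)):
        raise ValueError("requires xs[i] <= ys[i] for every i")
    return cf(p, xs) <= cf(p, ys)


if __name__ == "__main__":
    xs = [1.0, 2.0, 7.0]
    print(is_translation_invariant(median_cf, 0, xs, 5.0))
    print(is_monotone_on(median_cf, 0, xs, [1.5, 2.0, 9.0]))
    print(is_translation_invariant(squared_max, 0, xs, 5.0))
```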

Third, we require that the values of CF for two different processors p and q using similar values
for at least N−k corresponding arguments be closer than $x_p$ and $x_q$ were. This is the reason CF is
called a "convergence function". The utility of a convergence function in this regard is characterized
by a constant k called the fault-tolerance degree and a function $\pi$ called the precision.⁷
Fault-tolerance degree specifies the number of argument values that can differ significantly in the
evaluation of CF by p and the evaluation of CF by q without greatly affecting the difference in the results;
precision specifies how close together values obtained by these two evaluations must be. This is
formalized by the

Precision Enhancement Property: $|CF(p, x_1, \ldots, x_N) - CF(q, y_1, \ldots, y_N)| \le \pi(\delta, \epsilon)$ if

(a) at least N−k of the $x_i$'s are within $\delta$ of each other,

(b) the $y_i$'s corresponding to those N−k $x_i$'s are within $\delta$ of each other, and

(c) for each of the N−k argument pairs, $|y_i - x_i| \le \epsilon$.

Conditions (a) and (b) define $\delta$ to be the width of the interval spanned by values from correct
processors; when using CF to implement a reliable time source, this condition is satisfied if virtual clocks at
correct processors are synchronized to within $\delta$ when read by p and q. Condition (c) stipulates that
corresponding (correct) arguments to CF are at most $\epsilon$ apart; for a reliable time source, this condition
is satisfied if two values obtained by reading the same virtual clock $v$ (real) seconds apart, for small
values of $v$, do not differ by more than $v + \epsilon$ as a result of drift.

The Precision Enhancement Property states that in order for CF to be a convergence function,
two evaluations must produce values that are close (at most $\pi(\delta, \epsilon)$ apart) provided correct values
are within $\delta$, even though the values used for k of the arguments (presumably, from faulty processors)
differ arbitrarily and each remaining pair of corresponding arguments differs by at most $\epsilon$. Provided
$\pi(\delta, \epsilon) < \delta$, CF implements a time source that furnishes different processors with time values that are


⁷Our use of the term precision is based on its usual definition in connection with data and error analysis in the physical
sciences [Bevington 69]. There, "precision" is a measure of how exactly a result is determined and, therefore, how
reproducible that result is. When used in this sense, precision asserts nothing about whether the result is close to the quantity
actually being measured, only that it is close to other results that measure that quantity. The term "accuracy" is reserved for
characterizing how close a result is to the true value it measures.
closer than the least synchronized virtual clocks at correct processors.

The final property of a convergence function CF asserts that $CF(p, x_1, \ldots, x_N)$ is not more than
$\alpha(\delta)$ away from any correct argument, where any argument found within a $\delta$ width interval
containing N−k or more arguments is considered correct.

Accuracy Preservation Property: Let $X_{OK}$ be a subset of $x_1, \ldots, x_N$ whose members are within
$\delta$ of N−k−1 of $x_1, \ldots, x_N$. Then,

$(\forall p:\ x_p \in X_{OK}:\ |x_p - CF(p, x_1, \ldots, x_N)| \le \alpha(\delta))$.

An obvious consequence of this definition is

$\alpha(\delta) \ge \delta$. (4.1)

When CF is used as a reliable time source and correct clocks are synchronized to within $\delta$, $\alpha(\delta)$
bounds the maximum amount by which the virtual clock at a processor p must be adjusted. That is, for
all correct processors p:

$(\forall i:\ 0 < i:\ |adj^{i+1}_p - adj^i_p| \le \alpha(\delta))$. (4.2)

Function $\alpha$ is called the accuracy of CF. This (in the sense of [Bevington 69]) is an apt name
for two reasons. First, $\alpha$ bounds the rate change made to a virtual clock $\hat{c}_p$ (through $FIX_p$), thereby
bounding the "accuracy" of the rate of that virtual clock. Second, insofar as the clock at any correct
processor q approximates the real time and is therefore considered the true value of interest, $\alpha$ bounds
the difference between the value of a newly reset virtual clock and that true value.

Examples of functions that satisfy the three properties of convergence functions include:

Egocentric Average: $CF_{EA}(p, x_1, \ldots, x_N)$ is the average of all arguments $x_1$ through $x_N$ that are no more than $\delta$ from $x_p$.

Fast Convergence Algorithm: $CF_{FCA}(p, x_1, \ldots, x_N)$ is the average of all arguments $x_1$ through $x_N$ that are within $\delta$ of at least $N-k$ other arguments.

Fault-tolerant Midpoint: $CF_{Mid}(p, x_1, \ldots, x_N)$ is the midpoint of the range spanned by arguments $x_1$ through $x_N$ after the k highest and k lowest values have been discarded.

Fault-tolerant Average: $CF_{Avg}(p, x_1, \ldots, x_N)$ is the average of arguments $x_1$ through $x_N$ after the k highest and k lowest values have been discarded.
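As an illustration, three of these functions (those fully determined by their one-line descriptions) can be sketched in Python. This is a minimal sketch, not the authors' code: the function names, the zero-based index `p`, and the sample readings are assumptions made here for illustration.

```python
def cf_egocentric_average(p, xs, delta):
    """Average of all arguments that are no more than delta from xs[p]."""
    near = [x for x in xs if abs(x - xs[p]) <= delta]
    return sum(near) / len(near)  # never empty: xs[p] is within delta of itself


def _trim(xs, k):
    """Discard the k highest and k lowest values (assumes len(xs) > 2 * k)."""
    ordered = sorted(xs)
    return ordered[k:len(ordered) - k]


def cf_midpoint(p, xs, k):
    """Midpoint of the range spanned by the arguments that survive trimming."""
    kept = _trim(xs, k)
    return (kept[0] + kept[-1]) / 2


def cf_average(p, xs, k):
    """Average of the arguments that survive trimming."""
    kept = _trim(xs, k)
    return sum(kept) / len(kept)


# Hypothetical readings: three clocks close together and one faulty outlier.
readings = [100, 102, 104, 500]
print(cf_egocentric_average(0, readings, delta=5))  # averages 100, 102, 104
print(cf_midpoint(0, readings, k=1))                # midpoint of 102 and 104
print(cf_average(0, readings, k=1))                 # average of 102 and 104
```

In this sketch the midpoint and average ignore `p`, while the egocentric average depends on which processor evaluates it; that dependence is why the processor is an argument of a convergence function.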

The fault-tolerance degree k, precision $\pi(\delta, \epsilon)$ when there are f faulty processors, precision when $f = k$ as N goes to infinity, and accuracy $\alpha(\delta)$ for each of the above functions is given in Figure 4.1. The other convergence functions mentioned in the figure are discussed in section 5.

Name                 Fault-tolerance   Precision $\pi(\delta, \epsilon)$         Worst precision                     Accuracy $\alpha(\delta)$
                     degree $k$        ($f$ faults)                              (i.e. $f = k$)

$CF_{EA}$            $(N-1)/3$         [illegible]                               $\delta + \epsilon$                 $4\delta/3$
$CF_{FCA}$           $(N-1)/3$         [illegible]                               [illegible]                         $4\delta/3$
$CF_{Mid}$           $(N-1)/3$         [illegible]                               [illegible]                         $\delta$
$CF_{Avg}$           $(N-1)/3$         $f\delta/(N-2k) + \epsilon$ (?)           $\delta + \epsilon$                 $\delta$
$CF_{CCA}$           $(N-1)/3$         [illegible]                               [illegible]                         $4\delta/3$
$CF_{Byz}$ 8         $(N-1)/2$         $2\Lambda$                                $2\Lambda$                          $\delta$
$CF_{FW}$ SE1 8      $N-1$             $\Gamma_{max}(1+\rho)/(1-\rho)$           $\Gamma_{max}(1+\rho)/(1-\rho)$     $\Gamma_{max} + 2(\delta - \Gamma_{min})$
$CF_{FW}$ SE2 8      $(N-1)/2$         $\Gamma_{max}(1+\rho)/(1-\rho)$           $\Gamma_{max}(1+\rho)/(1-\rho)$     $\Gamma_{max} + \delta - \Gamma_{min}$

Figure 4.1. Properties of Convergence Functions. Entries marked [illegible] could not be recovered from the source; the entry marked (?) is an uncertain reading.

$CF_{EA}$ was first presented and analyzed in [Lamport & Melliar-Smith 85] in connection with their interactive convergence clock synchronization algorithm. $CF_{FCA}$ was proposed in [Mahaney & Schneider 85]. $CF_{Mid}$ and $CF_{Avg}$ are given in [Dolev et al. 83]; $CF_{Avg}$ is the basis for the clock synchronization protocol of [Lundelius & Lynch 84] and the (AMI S65C60) VLSI clock synchronization chip described by [Kopetz & Ochsenreiter 87]. Characterizing convergence functions in terms of precision and accuracy was first done by [Mahaney & Schneider 85]; most of the precision and accuracy functions given in Figure 4.1 were first reported there.

5.  Using  Agreement  for  Convergence 

An  agreement  protocol  allows  correct  processors  in  a  distributed  system  to  agree  on  an  action 
or  a  set  of  values.  This  can  help  in  two  ways  when  implementing  a  Reliable  Time  Source.  First,  use 
of  an  agreement  protocol  to  disseminate  a  signal  that  causes  processors  to  resynchronize  clocks  can 
be  used  to  satisfy  RTS1.  Second,  use  of  an  agreement  protocol  to  disseminate  each  processor's  clock 
can  ensure  that  arguments  in  corresponding  positions  in  evaluations  of  CF  performed  by  different 
processors  are  equal,  thereby  enhancing  the  precision  of  CF  and  helping  to  satisfy  RTS2. 

Crusader’s Agreement [Dolev 82] allows a designated processor, called the transmitter, to
disseminate  a  value  in  such  a  way  that: 

CRU1:  All  correct  processors  that  do  not  "know"  that  the  transmitter  is  faulty  agree  on  the 
same  value. 

CRU2:  If  the  transmitter  is  correct  then  all  correct  processors  agree  on  its  value. 

Thus,  Crusader’s  Agreement  potentially  partitions  processors  into  three  classes:  those  that  are  faulty, 
those  that  are  correct  and  "know"  that  the  transmitter  is  faulty,  and  those  that  are  correct  and  have 
agreed  among  themselves  on  a  value  from  the  ones  sent  by  the  transmitter.9  Crusader’s  Agreement  is 
simple  and  inexpensive  to  implement  in  a  distributed  system  where  fewer  than  1/3  of  the  processors 
are faulty and reliable communication is possible.10

Byzantine Agreement [Lamport et al. 82] is stronger (but more expensive to achieve) than Crusader’s Agreement: all correct processors agree on a value whether or not the transmitter is faulty.

BYZ1: All correct processors agree on the same value.

BYZ2: If the transmitter is correct then all correct processors agree on its value.

The literature contains numerous protocols for establishing Byzantine Agreement. An early survey of the area appears in [Fisher 83] and a tutorial in [Schneider 85].

9 If the transmitter is correct then the set of correct processors that "know" that the transmitter is faulty will be empty.

10 A communications failure can always be viewed as a failure of either the sending or receiving processor. Assuming reliable message delivery here is merely an expository convenience.

5.1.  Agreement  with  Clocks 

Protocols to implement Crusader’s Agreement and Byzantine Agreement usually proceed as a series of rounds. In the first round, the transmitter sends its value to every other processor. In subsequent rounds, each processor sends a copy of every value it has received to every other processor. Eventually, each processor selects one from among the set of values it has received. The criteria for selection depend on the protocol; use of median or mode is not unusual. Relaying messages through different paths, although seemingly inefficient, is necessary because it prevents correct processors from being confounded by inconsistent values sent along different routes by faulty processors.

An agreement protocol intended for disseminating values must be modified for use in disseminating clocks. This is because, while operations like making copies of values and sending such copies through a network are simple, making copies of clocks and sending them through a network is not. The key to avoiding this problem is to compute and send clock differences rather than the clocks themselves [Lamport & Melliar-Smith 84].

To implement this scheme, $c_x$ is encoded as a triple $(proc, i, offset)$ that specifies $c_x$ has difference $offset$ from $c_{proc}^i$. Thus, $c_x(t)$ can be approximated by a processor p as $c_p(t) + \tau_p^i[proc] + offset$. This allows p to copy and send $c_x$ to another processor q by executing

send (proc, i, offset) to q.

Processor q receives this copy by executing

receive (proc', i', offset')

and thereafter approximates $c_x$ at time t by evaluating $c_q(t) + \tau_q^{i'}[proc'] + offset'$.

When a clock is approximated in this manner, error is introduced by passing that clock from p to q because $c_q(t) + \tau_q^i[proc]$ is only an approximation for $c_{proc}^i(t)$. This means copies of $c_x$ that traverse different routes and are received by a single processor might not be identical, even though they should be. Consequently, equality tests or selection of a clock based on the mode of a set of clocks received cannot be used when clocks are passed around the system in this fashion.
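The encoding can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `ClockCopy`, the function `approximate`, the dictionary `tau` (standing for a processor's table of estimated clock differences), and all numeric readings are hypothetical and chosen only to show why two copies of the same clock can disagree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockCopy:
    proc: int      # processor whose clock serves as the reference
    i: int         # which virtual clock of proc is meant
    offset: float  # encoded clock minus the reference clock


def approximate(copy, local_clock, tau):
    """Local clock + estimated difference to copy.proc + encoded offset."""
    return local_clock + tau[copy.proc] + copy.offset


# Hypothetical instant: the reference clock (processor 2) reads 1010.
copy = ClockCopy(proc=2, i=1, offset=2.0)    # encodes a clock reading 1012
at_p = approximate(copy, 1000.0, {2: 10.0})  # p's estimate of the difference is exact
at_q = approximate(copy, 1003.0, {2: 7.5})   # q's estimate is off by 0.5
print(at_p, at_q)                            # the two copies are not identical
```

The triple itself is copied exactly; the discrepancy comes only from each receiver's own estimate of the difference between its clock and the reference clock.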

Two schemes have been devised for modifying an agreement protocol to avoid these problems with inequality of clock copies. The first is for the agreement protocol to be formulated in a way that avoids using equality tests to select one from among the different (clock) copies received. Lamport and Melliar-Smith use this technique in their Byzantine Agreement protocols for clocks, which are based on Byzantine Agreement protocols [Lamport et al. 82] that take the median of the set of values received, and hence do not use equality of values. The second way to avoid the inequality of clock copies problem is to consider a collection of clocks "equal" if all are within $2\Lambda$ of some clock value in that collection. (Recall, $\Lambda$ is the maximum clock reading error between any pair of processes.) Mahaney and Schneider use this approach to modify the Crusader’s Agreement protocol of [Dolev 82], which uses equality of values, to handle clocks [Mahaney & Schneider 85].
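Both schemes can be sketched in a few lines. The helper names and sample values below are assumptions for illustration; `lam` stands for the clock reading error bound, and neither function is taken from the cited protocols.

```python
from statistics import median


def approximately_equal(clocks, lam):
    """True if all clocks are within 2*lam of some one clock in the collection."""
    return any(all(abs(c - ref) <= 2 * lam for c in clocks) for ref in clocks)


def select_by_median(clocks):
    """A selection rule that never compares clock copies for equality."""
    return median(clocks)


copies = [1012.0, 1012.5, 1011.5]                      # hypothetical copies of one clock
print(approximately_equal(copies, lam=0.5))            # True
print(approximately_equal([1012.0, 1015.0], lam=0.5))  # False
print(select_by_median(copies))                        # 1012.0
```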

5.2.  Obtaining  Faster  Convergence  by  Agreement 

The Crusader’s Convergence Algorithm $CF_{CCA}$ of [Mahaney & Schneider 85] is the result of employing Crusader’s Agreement to disseminate values before applying $CF_{FCA}$.

Crusader’s Convergence: $CF_{CCA}$ is:

(1) Each processor employs the Crusader’s Agreement protocol to disseminate its clock.

(2) The value of $CF_{CCA}$ at processor p is the result of p applying $CF_{FCA}$ to the set of clocks received.

$CF_{CCA}$ has half the precision of $CF_{FCA}$ (i.e. convergence is twice as good) because, due to CRU1 of Crusader’s Agreement, it is not possible for correct processors p and q to use values for $c_r(t)$ that differ by more than $2\Lambda$ unless one of p and q "knows" that r is faulty, in which case it can ignore $c_r(t)$ completely. $CF_{CCA}$ has the same accuracy and degree of fault tolerance as $CF_{FCA}$. It is interesting to note that when $CF_{FCA}$ is iterated twice (which requires the same two rounds of message exchange as the Crusader’s Agreement used in $CF_{CCA}$) the worst case precision is $4\delta/9$, clearly inferior to the $\delta/3$ precision achieved when the two rounds of message exchange are used for a Crusader’s Agreement. Employing Crusader’s Agreement before $CF_{EA}$, $CF_{Mid}$ and $CF_{Avg}$ also results in precision improvements for those convergence functions.

When a Byzantine Agreement is used to disseminate clocks, all correct processors agree within $2\Lambda$ on an approximation for the clock at each processor, due to BYZ1 and the error bounds in approximating clocks. Correct processors evaluating a convergence function will then differ by at most $2\Lambda$ in values in corresponding argument positions. Define $Sel_g$ to be a function that returns its $g$th largest argument. If we employ a Byzantine Agreement protocol that can tolerate k failures to disseminate arguments used in $Sel_{k+1}$, then we obtain a convergence function $CF_{Byz}$ for clock synchronization:

Byzantine Convergence: $CF_{Byz}$ is:

(1) Each processor employs the Byzantine Agreement protocol to disseminate its clock.

(2) The value of $CF_{Byz}$ at processor p is the result of p applying $Sel_{k+1}$ to the set of clocks received.

Provided there are k or fewer failures, $Sel_{k+1}$ at a correct processor p selects a clock that is guaranteed to read within $\epsilon = 2\Lambda$ of the clock selected by every other correct processor. This means that the precision of $CF_{Byz}$ is $\pi_{Byz}(\delta, \epsilon) = 2\Lambda$: the precision for the convergence function is independent of $\delta$! To bound the accuracy, note that because $k < g < N-k$, the $g$th largest clock is either a correct clock or lies between correct clocks. If correct clocks are within $\delta$, then the new clock is no more than $\delta$ away from a correct clock, so we conclude that the accuracy of the algorithm is $\alpha_{Byz}(\delta) = \delta$.
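The selection step can be sketched as below. The function names and the two views are hypothetical; the views are constructed so that corresponding entries differ by at most 1.0, playing the role of the $2\Lambda$ bound with a reading error of 0.5.

```python
def sel(g, values):
    """Return the g-th largest argument (g = 1 selects the largest)."""
    return sorted(values, reverse=True)[g - 1]


def cf_byz(values, k):
    """Apply Sel_{k+1} to the disseminated clock values."""
    return sel(k + 1, values)


k = 2
# Hypothetical views: positions 0-2 hold correct clocks, 3-4 hold faulty ones.
view_p = [100.0, 102.0, 104.0, 900.0, -50.0]
view_q = [100.5, 101.0, 104.5, 899.5, -49.0]  # each entry within 1.0 of view_p
print(cf_byz(view_p, k), cf_byz(view_q, k))
```

Because an order statistic moves by no more than the largest change in any single argument, the two selected values differ by at most the bound on corresponding entries, however far apart the correct clocks themselves are. Here both results also fall inside the range of the correct clocks.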

Clock  synchronization  algorithms  based  on  Byzantine  Agreement  are  described  in  [Lamport  & 
Melliar-Smith  84]  and  analyzed  in  [Lamport  &  Melliar-Smith  85]. 

5.3. Fireworks Agreement: An Optimization

When $CF_{Byz}$ is used as a convergence function, only the largest $k+1$ clocks are actually needed. (Only the $(k+1)$st largest clock is returned, but to decide which clock is the $(k+1)$st largest, the $k+1$ largest clocks are needed.) Since performing a Byzantine Agreement can be costly, in both delay and number of messages exchanged, avoiding Byzantine Agreements on the other clocks is desirable. We, therefore, propose a somewhat weaker form of agreement to take the place of the Byzantine Agreements used in connection with $CF_{Byz}$. This new form of agreement, which we call a Fireworks Agreement, effectively allows correct processors to agree on the value of a single correct clock by causing all to terminate the protocol at approximately the same (real) time:

FW: All correct processors terminate with some a priori decided value v within $\beta$ real seconds of each other.

The name Fireworks Agreement is in analogy with a public fireworks display, where participants agree on when the display is over. In a fireworks display, $\beta$ is non-zero if observers are different distances from the pyrotechnics; in a distributed system, $\beta$ is related to message-delivery times.

In describing a protocol to implement Fireworks Agreement, we will assume that it is possible for a correct processor to

A1: authenticate the sender of every message it receives and

A2: determine whether a message it receives was modified by processors that relayed the message.

These assumptions are satisfied if digital signatures are employed by the sender of a message or if fewer than 1/3 of the processors are faulty and the simulated authentication technique of [Srikanth & Toueg 84] is used to transmit messages. In either case (i) faulty processors are unable to masquerade as correct processors and (ii) faulty processors are unable to modify and then retransmit messages received from correct processors.


The following protocol implements a Fireworks Agreement for a message with value T. The protocol is specified for a processor p and described as two rules, each of which might be implemented as a separate process. The term "sufficient evidence" of rule (2) is defined below.

(1) When $c_p(t) = T$, processor p signs and broadcasts (T, p) to all processors (including itself).

(2) Upon receiving "sufficient evidence", p broadcasts that evidence to all processors and terminates the protocol.
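The two rules might be organized as a pair of event handlers, as in the sketch below. Everything here is an assumption made for illustration: the class and method names, the pluggable `is_sufficient` predicate, and the `broadcast` callable are hypothetical; message signing is omitted; and rule (1) is triggered when the clock reaches or passes T, a practical relaxation of the equality test in the rule.

```python
class FireworksProcessor:
    """One participant in a Fireworks Agreement for value T (illustrative only)."""

    def __init__(self, pid, T, is_sufficient, broadcast):
        self.pid = pid
        self.T = T
        self.is_sufficient = is_sufficient  # pluggable test for "sufficient evidence"
        self.broadcast = broadcast          # callable that sends to all processors
        self.announced = False
        self.terminated = False

    def on_clock(self, clock_value):
        """Rule (1): announce (T, pid) once the local clock reaches T."""
        if not self.announced and clock_value >= self.T:
            self.announced = True
            self.broadcast((self.T, self.pid))

    def on_receive(self, evidence):
        """Rule (2): on sufficient evidence, rebroadcast it and terminate."""
        if not self.terminated and self.is_sufficient(evidence):
            self.terminated = True
            self.broadcast(evidence)


sent = []
p = FireworksProcessor(pid=1, T=1000, is_sufficient=lambda ev: True, broadcast=sent.append)
p.on_clock(999)          # clock has not reached T: nothing is sent
p.on_clock(1000)         # rule (1) fires
p.on_receive((1000, 7))  # rule (2) fires and p terminates
print(sent, p.terminated)
```

Keeping the evidence test as a parameter mirrors the text: the termination argument depends only on two properties of the test, not on its particular form.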


Two different schemes have been proposed for determining when there is "sufficient evidence" as required in rule (2). Before turning to the details of these, we show that any scheme satisfying the following properties leads to termination of the protocol by all correct processors within $\beta = \Gamma_{max}/(1-\rho)$ real seconds:

Achievement of Sufficient Evidence: Some correct processor eventually determines that there is "sufficient evidence".

Criterion for Sufficient Evidence: Evidence that is considered sufficient by a correct processor p and rebroadcast is considered sufficient by any correct processor receiving that broadcast.


According to Achievement of Sufficient Evidence, eventually some correct processor will determine that there is "sufficient evidence". Suppose p is the first to terminate and does so at real time $t_{first}$. According to rule (2) above, it must have broadcast its "sufficient evidence" to all processors. In the worst case, there are no other undelivered messages in the network when p makes that broadcast. Thus, p’s "sufficient evidence" can take as long as $\Gamma_{max}/(1-\rho)$ real seconds to be received by another correct processor q and therefore can be received as late as real time $t_{first} + \Gamma_{max}/(1-\rho)$. According to Criterion for Sufficient Evidence, q must also consider this "sufficient evidence", and, according to rule (2), terminate the protocol. Thus, by $t_{first} + \Gamma_{max}/(1-\rho)$ all correct processors have terminated the Fireworks Agreement and we conclude $\beta = \Gamma_{max}/(1-\rho)$.


Independent of the refinement of "sufficient evidence", Fireworks Agreement is used in constructing a convergence function $CF_{FW}$ as follows. For the $i$th Fireworks Agreement, we use $T = (i-1)R$ where (as in section 2)

[equation illegible in source; the surviving fragments are $(1+\rho)$, $(1-\rho)$, and a bound involving $r$]

And, for the value of $CF_{FW}(p, \ldots)$ associated with the $i$th Fireworks Agreement we use:

$$CF_{FW}(p, c_1(t_p^i)+v, \ldots, c_N(t_p^i)+v) = (i-1)R + \Gamma_{max} + v \quad \text{for } 0 \le v. \tag{5.1}$$

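Note that Monotonicity and Translation Invariance hold for $CF_{FW}$ by definition.

To bound precision $\pi_{FW}(\delta, \epsilon)$ of $CF_{FW}$, substituting into the definition of precision, we get:

$$\pi_{FW}(\delta, \epsilon) \le |CF_{FW}(p, c_1(t_q^i), \ldots, c_N(t_q^i)) - CF_{FW}(q, c_1(t_q^i), \ldots, c_N(t_q^i))|. \tag{5.2}$$

Without loss of generality, suppose $t_p^i \le t_q^i$, so that due to Monotonicity (5.2) simplifies to

$$\pi_{FW}(\delta, \epsilon) \le CF_{FW}(p, c_1(t_q^i), \ldots, c_N(t_q^i)) - CF_{FW}(q, c_1(t_q^i), \ldots, c_N(t_q^i)). \tag{5.3}$$

Using (5.1) with $v = (t_q^i - t_p^i)(1+\rho)$ we get:

$$CF_{FW}(p, c_1(t_p^i) + (t_q^i - t_p^i)(1+\rho), \ldots, c_N(t_p^i) + (t_q^i - t_p^i)(1+\rho)) = (i-1)R + \Gamma_{max} + (t_q^i - t_p^i)(1+\rho). \tag{5.4}$$

Equation (5.4) is now simplified as follows. First, because $c_1(t_q^i) \le c_1(t_p^i) + (t_q^i - t_p^i)(1+\rho)$ due to Virtual Rate (2.4), we conclude using Monotonicity that

$$CF_{FW}(p, c_1(t_q^i), \ldots, c_N(t_q^i)) \le CF_{FW}(p, c_1(t_p^i) + (t_q^i - t_p^i)(1+\rho), \ldots, c_N(t_p^i) + (t_q^i - t_p^i)(1+\rho)).$$

Therefore, transitivity with (5.4) yields

$$CF_{FW}(p, c_1(t_q^i), \ldots, c_N(t_q^i)) \le (i-1)R + \Gamma_{max} + (t_q^i - t_p^i)(1+\rho).$$

By definition of $\beta$, $t_q^i - t_p^i \le \beta$. Making this substitution into the previous equation results in

$$CF_{FW}(p, c_1(t_q^i), \ldots, c_N(t_q^i)) \le (i-1)R + \Gamma_{max} + \beta(1+\rho).$$

Substituting this into (5.3) gives a bound for $\pi_{FW}(\delta, \epsilon)$:

$$\pi_{FW}(\delta, \epsilon) \le ((i-1)R + \Gamma_{max} + \beta(1+\rho)) - CF_{FW}(q, c_1(t_q^i), \ldots, c_N(t_q^i)). \tag{5.5}$$

By definition (5.1), $CF_{FW}(q, c_1(t_q^i), \ldots, c_N(t_q^i)) = (i-1)R + \Gamma_{max}$. We use this to simplify (5.5) further, obtaining

$$\pi_{FW}(\delta, \epsilon) \le ((i-1)R + \Gamma_{max} + \beta(1+\rho)) - ((i-1)R + \Gamma_{max}) = \beta(1+\rho) = \frac{\Gamma_{max}}{1-\rho}(1+\rho).$$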
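To get a feel for the size of this bound, it can be evaluated numerically. The sketch below only restates the two formulas derived above, $\beta = \Gamma_{max}/(1-\rho)$ and $\beta(1+\rho)$; the function names and the sample parameters (a 10 ms maximum delivery delay and a drift rate of $10^{-5}$) are assumptions chosen for illustration.

```python
def fireworks_beta(gamma_max, rho):
    """Bound on the real-time spread of termination: Gamma_max / (1 - rho)."""
    return gamma_max / (1 - rho)


def fireworks_precision(gamma_max, rho):
    """Precision bound beta * (1 + rho) = Gamma_max * (1 + rho) / (1 - rho)."""
    return fireworks_beta(gamma_max, rho) * (1 + rho)


# Hypothetical parameters: 10 ms maximum delivery delay, drift rate 1e-5.
gamma_max, rho = 0.010, 1e-5
beta = fireworks_beta(gamma_max, rho)
pi_fw = fireworks_precision(gamma_max, rho)
print(beta, pi_fw)
```

For small drift rates both quantities are only slightly larger than the maximum delivery delay itself, so the bound is dominated by that delay.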


Sufficient Evidence

One characterization of "sufficient evidence", which is the basis for the clock synchronization protocol of [Halpern et al. 84], exploits the fact that the clock at a correct processor must be within $\delta$ of the clock at any other correct processor.11

SE1: Receipt of a message $m = (T, q)$ by p is considered sufficient evidence iff m is correctly signed by $s \ge 1$ processors and received by p at real time $t_{rcv}$ such that

$$T - s(\delta + \Gamma_{min}) \le c_p(t_{rcv}) \le T + s(\delta + \Gamma_{max}). \tag{5.6}$$

To show that SE1 satisfies the Criterion for Sufficient Evidence, suppose a Fireworks Agreement terminates at p at time $c_p(t_{rcv})$ due to receipt of a message m. Thus, (5.6) holds. We must show that (5.6) will hold whenever m is forwarded to another correct processor q. Thus, we must show that

$$T - (s+1)(\delta + \Gamma_{min}) \le c_q(t_{rcv2}) \le T + (s+1)(\delta + \Gamma_{max})$$

holds, where $t_{rcv2}$ is the time that q received the copy of m forwarded by p. Since p and q are both correct, $|c_p(t_{rcv}) - c_q(t_{rcv})| \le \delta$. Therefore, we can rewrite (5.6) in terms of $c_q(t_{rcv})$:

$$T - s(\delta + \Gamma_{min}) - \delta \le c_q(t_{rcv}) \le T + s(\delta + \Gamma_{max}) + \delta. \tag{5.7}$$

Since at real time $t_{rcv}$, p forwarded the evidence to q, by the definitions of $\Gamma_{min}$ and $\Gamma_{max}$ we have

$$c_q(t_{rcv}) + \Gamma_{min} \le c_q(t_{rcv2}) \le c_q(t_{rcv}) + \Gamma_{max}. \tag{5.8}$$

We can now substitute in (5.8) for $c_q(t_{rcv})$ using (5.7) and obtain

$$T - s(\delta + \Gamma_{min}) - \delta + \Gamma_{min} \le c_q(t_{rcv2}) \le T + s(\delta + \Gamma_{max}) + \delta + \Gamma_{max},$$

which, since the copy of m forwarded to q by p contains one more signature, implies (5.6).
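The argument can be checked mechanically on a small integer grid. The sketch below assumes the window $T - s(\delta + \Gamma_{min}) \le c \le T + s(\delta + \Gamma_{max})$ for a message carrying s signatures; the function names and parameter values are hypothetical, and an exhaustive check over a few integers illustrates the proof rather than replacing it.

```python
def se1_sufficient(T, s, clock_at_receipt, delta, gamma_min, gamma_max):
    """Window test on the receiver's clock for a message carrying s signatures."""
    if s < 1:
        return False
    low = T - s * (delta + gamma_min)
    high = T + s * (delta + gamma_max)
    return low <= clock_at_receipt <= high


def forwarding_preserves_se1(T, delta, gamma_min, gamma_max, max_signatures=3):
    """Check that evidence accepted by p is still accepted by q after forwarding."""
    for s in range(1, max_signatures + 1):
        low = T - s * (delta + gamma_min)
        high = T + s * (delta + gamma_max)
        for cp in range(low, high + 1):                        # p accepts at clock cp
            for skew in range(-delta, delta + 1):              # q's clock minus p's clock
                for delay in range(gamma_min, gamma_max + 1):  # forwarding delay
                    cq = cp + skew + delay                     # q's clock on receipt
                    if not se1_sufficient(T, s + 1, cq, delta, gamma_min, gamma_max):
                        return False
    return True


print(forwarding_preserves_se1(T=1000, delta=4, gamma_min=1, gamma_max=3))  # True
```

The extra signature added on forwarding widens the window by exactly enough to absorb one clock skew of at most $\delta$ and one delivery delay of at most $\Gamma_{max}$.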


It only remains to show that SE1 is eventually satisfied, hence Achievement of Sufficient Evidence holds. The argument is simple. A correct processor executing rule (1) of the protocol will receive a copy of the message it has broadcast. This copy will satisfy (5.6) because it will arrive between $\Gamma_{min}$ and $\Gamma_{max}$ clock seconds after it was sent.


Accuracy $\alpha_{SE1}(\delta)$ for SE1 is illustrated as follows. Suppose

p is the correct processor with the fastest clock,

q is a faulty (even faster) processor such that $c_q(t) - c_p(t) = \delta$, and

r is the correct processor with the slowest clock and therefore $c_p(t) - c_r(t) = \delta$.

Further, suppose q executes rule (1) at time $c_q(t_{snd}) = T$ and broadcasts a message $m = (T, q)$. By definition of p and q, $c_p(t_{snd}) = T - \delta$. The message will, therefore, be delivered to p at a time satisfying $T - \delta + \Gamma_{min} \le c_p(t_{rcv}) \le T - \delta + \Gamma_{max}$ and p will find the message to be sufficient evidence, because it satisfies (5.6). By definition of p and r, we have

$$T - \delta + \Gamma_{min} - \delta \le c_r(t_{rcv}) \le T - \delta + \Gamma_{max} - \delta.$$

Therefore, when r receives the copy of the message rebroadcast (according to rule (2)) by p, that time $c_r(t_{rcv2})$ is given by

$$T - \delta + \Gamma_{min} - \delta + \Gamma_{min} \le c_r(t_{rcv2}) \le T - \delta + \Gamma_{max} - \delta + \Gamma_{max}.$$

The message, therefore, satisfies (5.6) and is sufficient evidence for r to terminate. Moreover, since $T - \delta + \Gamma_{min} - \delta + \Gamma_{min} \le c_r(t_{rcv2})$ and according to the protocol (i.e. (5.1)) r must set its clock ahead to $T + \Gamma_{max}$, r might therefore have to set it ahead by as much as

$$(T + \Gamma_{max}) - (T - \delta + \Gamma_{min} - \delta + \Gamma_{min}).$$

We conclude

$$\alpha_{SE1}(\delta) = \Gamma_{max} + 2(\delta - \Gamma_{min}).$$

11 The protocol of [Cristian et al. 86] also uses a variant of this form of "sufficient evidence". However, the test used there is simpler than the one discussed here because their protocol tolerates only omission failures, not full Byzantine failures.

Accuracy $\alpha_{SE1}(\delta)$ reveals a problem with SE1: a faulty processor (i.e. q) with a fast clock can cause clocks at correct processors to reset so that they run faster than they should. (The consequences of this are quantified in the Appendix.) On the other hand, SE1 has fault-tolerance degree $N-1$ because it was not necessary to stipulate an upper bound on the number of faulty processors.

A second characterization of "sufficient evidence", first used in the clock synchronization protocol of [Srikanth & Toueg 84]12, is based on the fact that if every processor broadcasts a message when its clock reads T, then provided there are at most k faulty processors, the $(k+1)$st message received must be from a correct one or must follow a message from a correct one.

SE2: Receipt of $k+1$ messages originated by distinct processors is considered sufficient evidence.

It is easy to see that SE2 satisfies our Criterion for Sufficient Evidence: even after being forwarded to another processor, the $k+1$ messages used for sufficient evidence at one processor are still originated by $k+1$ distinct processors, so they will be considered sufficient evidence at another. Ensuring Achievement of Sufficient Evidence requires making an assumption about the number of faulty processors. SE2 is guaranteed to hold only if $N \ge 2k+1$ because then there are fewer than $k+1$ faulty processors and at least $k+1$ correct ones. Thus, fault-tolerance degree $k = (N-1)/2$.

12 A similar scheme was later used in the protocol of [Babaoglu & Drummond 87].

The fact that when some processor receives sufficient evidence according to SE2 it must have received a message from a correct processor means that the accuracy of SE2 is better than that of SE1. A scenario that achieves worst-case accuracy with SE2 is given by the following. Suppose,

$p_1, p_2, \ldots, p_k$ are correct processors with fast clocks,

$p_{k+1}$ is a faulty processor with a fast clock, and

r is the correct processor with the slowest clock, so $(\forall i : 1 \le i \le k+1 : c_{p_i}(t) - c_r(t) = \delta)$.

Further, suppose each processor $p_i$, $1 \le i \le k+1$, broadcasts a message when $c_{p_i}(t) = (i-1)R = T$. Thus, these messages are sent at time $c_r(t_{snd}) = T - \delta$ and can be received by r as early as time $c_r(t_{rcv}) = T - \delta + \Gamma_{min}$. The set of $k+1$ messages broadcast by $p_1$ through $p_{k+1}$ satisfy SE2, so r must advance its clock by as much as

$$CF_{FW}(r, c_1(t_r^i), \ldots, c_N(t_r^i)) - (T - \delta + \Gamma_{min}) = (T + \Gamma_{max}) - (T - \delta + \Gamma_{min})$$

and we conclude

$$\alpha_{SE2}(\delta) = \Gamma_{max} + \delta - \Gamma_{min}.$$

Clearly, accuracy with SE2 is superior to that achieved with SE1. This is not without cost, however. SE2 requires that fewer than half the processors are faulty; SE1 makes no assumptions about the number of faulty processors.
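The SE2 test and the two worst-case scenarios can be replayed with concrete numbers. This is a sketch under stated assumptions: the function names and the values of T, $\delta$, $\Gamma_{min}$ and $\Gamma_{max}$ are hypothetical, messages are modelled as (value, originator) pairs whose signatures are taken to be already verified, and the set-ahead functions simply redo the arithmetic of the two scenarios.

```python
def se2_sufficient(messages, T, k):
    """True once messages for T from at least k+1 distinct originators are held."""
    originators = {origin for value, origin in messages if value == T}
    return len(originators) >= k + 1


def set_ahead_se1(T, delta, gamma_min, gamma_max):
    """SE1 scenario: faulty q -> fastest correct p -> slowest correct r."""
    earliest = T - delta + gamma_min - delta + gamma_min  # r's clock on receipt
    return (T + gamma_max) - earliest


def set_ahead_se2(T, delta, gamma_min, gamma_max):
    """SE2 scenario: k+1 fast processors send directly to the slowest correct r."""
    earliest = T - delta + gamma_min                      # r's clock on receipt
    return (T + gamma_max) - earliest


msgs = [(1000, 1), (1000, 2), (1000, 2), (1000, 5)]
print(se2_sufficient(msgs, T=1000, k=2))  # three distinct originators: True
print(set_ahead_se1(1000, 4, 1, 3))       # 3 + 2 * (4 - 1) = 9
print(set_ahead_se2(1000, 4, 1, 3))       # 3 + 4 - 1 = 6
```

The difference between the two results is $\delta - \Gamma_{min}$, the extra hop through p that SE1 permits a single faulty originator to exploit.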

Clock synchronization algorithms based on Fireworks Agreement are interesting because a processor cannot even evaluate CF without causing every other correct processor to resynchronize its clock. Thus, the convergence function provides an implementation of both RTS1 and RTS2; the convergence functions discussed earlier provided an implementation of RTS2 only. On the other hand, inherent in Fireworks Agreement is that processor clocks are read in the less accurate of the two ways presented in section 3. Moreover, while it is possible to achieve precision of $2\Lambda$ using an agreement algorithm (i.e. $CF_{Byz}$), $CF_{FW}$ does not come close. The precision of $CF_{FW}$ depends on the maximum message delivery delay, while precision of $CF_{Byz}$ is determined by the variance in message delivery delay.

6.  Discussion  and  Conclusions 

We have discussed clock synchronization protocols that can be viewed as refinements of a single paradigm. The paradigm is based on postulating a reliable time source that periodically issues messages to cause processors to synchronize their clocks. Implementing the reliable time source involves solving three subproblems. Different solutions to these subproblems yield different protocols.

The first subproblem defined by our paradigm is to generate events that cause all processors to resynchronize. Any solution to this subproblem can be characterized in terms of three constants: $r_{min}$ and $r_{max}$ bound the real-time interval that can elapse between when the first correct processor to resynchronize for the $i$th time does so and when the first correct processor to resynchronize for the $(i+1)$st time does so. $\beta$ bounds the real time that can elapse between when the first correct processor resynchronizes for the $i$th time and when the last correct processor resynchronizes for the $i$th time.

The second subproblem defined by our paradigm is how a program being executed by one processor can read the clocks on another. A solution to this subproblem is characterized in terms of $\Lambda$, an upper bound on clock reading error.

The final subproblem defined by our paradigm is choice of a convergence function. Any function that satisfies the four properties given in §4 (Monotonicity, Translation Invariance, Precision Enhancement, and Accuracy Preservation) will work. Such a function is characterized by its precision $\pi$, which bounds how closely it will bring values together, and its accuracy $\alpha$, which bounds how far its result will be from its argument.

If processor clocks run close together but far from real time, clocks implemented by an algorithm based on our paradigm will remain synchronized with each other but will diverge from real time. In order to construct a clock synchronization algorithm that keeps clocks close to real time, the reliable time source must remain close to real time. Various international standards organizations maintain highly accurate synchronized clocks. In the United States, WWV 60 kHz radio broadcasts provide a time signal accurate to a few milliseconds, as does the GOES satellite. (WWV broadcasts at 5, 10, and 15 MHz are accurate to only 100 milliseconds, due to uncertainty in propagation delays.) Employing radio receivers to inject such correct real times into a distributed system is one way to provide the needed source of time. Algorithms for clock synchronization when an external source of time is available are described in [Marzullo & Owicki 83], [Marzullo 84], and [Lamport 85].

The fact that so many clock synchronization algorithms can be viewed in terms of a single paradigm was a surprise. Previously, clock synchronization algorithms were viewed in terms of three classes: those based on convergence, those based on agreement, and those in the style of [Halpern et al. 84]. It was pleasing to discover that all the published algorithms can, in fact, be viewed in terms of a single paradigm based on convergence functions. In addition, viewing algorithms as refinements of a single paradigm allows their performance to be compared. Performance of a clock synchronization algorithm based on convergence functions is characterized by $\pi$, $\alpha$, and the cost of computing the underlying convergence function. Thus, by defining the notion of a convergence function and giving a framework in which its performance can be quantified, we have made it possible to compare existing algorithms as well as given insight into the construction of new algorithms.


Appendix I: Proof of Clock Synchronization


This section gives sufficient conditions to ensure that the clock synchronization protocol of Figure 2.1
satisfies correctness conditions Virtual Synchronization (2.3) and Virtual Rate (2.4). We
assume only the following about the solutions used for the three subproblems left open in that protocol.

Event Generation. $r_{\min}$ and $r_{\max}$ are the lower and upper bounds for the real-time interval that
can elapse between when the first correct processor to resynchronize for the $i$th time does so and
when the first correct processor to resynchronize for the $(i+1)$st time does so. $\beta$ bounds the real
time that can elapse between when the first and last correct processor resynchronizes for the $i$th
time.

Clock Reading. $\Lambda$ is an upper bound on the error associated with the value obtained when a
program executing on one processor reads the clock on another.

Convergence Function. $CF$ has precision $\pi$, has accuracy $\alpha$, and satisfies the Monotonicity,
Translation Invariance, Precision Enhancement, and Accuracy Preservation Properties of §4.
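Two of the properties assumed for the convergence function, Translation Invariance and Monotonicity, can be illustrated with a small sketch. The fault-tolerant midpoint used here is only one familiar candidate for a convergence function and is an assumption of this example, not necessarily a function analyzed in §4. The names `fault_tolerant_midpoint` and `shift` are hypothetical, and the processor argument of $CF$ is omitted for brevity.

```python
from typing import Sequence


def fault_tolerant_midpoint(readings: Sequence[float], k: int) -> float:
    """Drop the k lowest and k highest readings, then take the midpoint
    of the smallest and largest readings that remain."""
    if k < 0 or len(readings) <= 2 * k:
        raise ValueError("need more than 2k readings to discard k from each end")
    kept = sorted(readings)[k:len(readings) - k]
    return (kept[0] + kept[-1]) / 2.0


def shift(readings: Sequence[float], v: float) -> list:
    """Add the same offset v to every reading."""
    return [x + v for x in readings]


if __name__ == "__main__":
    # 97.0 stands in for the reading obtained from one faulty clock.
    readings = [10.0, 10.5, 11.0, 97.0]
    base = fault_tolerant_midpoint(readings, k=1)
    moved = fault_tolerant_midpoint(shift(readings, 2.5), k=1)
    # Translation Invariance: shifting every argument by v shifts the result by v.
    print(base, moved - base)
```

Raising any single reading never lowers the result of this function, which is the behavior the Monotonicity property asks for.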

To simplify the exposition that follows, p, q, r, and x are assumed to range over correct processors
only.

Synchronization  of  Virtual  Clocks 

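To prove that Virtual Synchronization (2.3) is satisfied, we start by establishing that all correct
processors have started their $i$th virtual clocks by the time the first correct processor starts its $(i+1)$st
virtual clock. This is necessary in order to be able to execute the assignment to $adj_p^{i+1}$ in the protocol.

Lemma 1: Let $t_{\min}^{i+1} = (\min r:\ t_r^{i+1})$. If $\beta \le r_{\min}$ then for any correct processor $p$, $t_p^{i} \le t_{\min}^{i+1}$.

Proof: Let $t_{\min}^{i} = (\min r:\ t_r^{i})$. By the definition of $r_{\min}$ in RTS1, $r_{\min} \le t_{\min}^{i+1} - t_{\min}^{i}$. Adding $t_{\min}^{i}$ to both
sides, we get $r_{\min} + t_{\min}^{i} \le t_{\min}^{i+1}$. The hypothesis that $\beta \le r_{\min}$ implies $t_{\min}^{i} + \beta \le r_{\min} + t_{\min}^{i}$, so by
transitivity $t_{\min}^{i} + \beta \le r_{\min} + t_{\min}^{i} \le t_{\min}^{i+1}$. Moreover, from the definition of $\beta$ in RTS1, $t_p^{i} \le t_{\min}^{i} + \beta$, so again by
transitivity $t_p^{i} \le t_{\min}^{i} + \beta \le r_{\min} + t_{\min}^{i} \le t_{\min}^{i+1}$. □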

We now prove that virtual clocks that employ instantaneous resynchronization (i.e. $AI \le \kappa$)
satisfy Virtual Synchronization (2.3). Define

$$\bar c_p^{\,i}(t) = c_p(t) + adj_p^{i}.$$

And, as before, let $\bar c_p(t)$ be the value of $\bar c_p^{\,i}(t)$ where $i$ satisfies $t_p^{i} \le t < t_p^{i+1}$. The proof of Virtual
Synchronization (2.3) for $\bar c_p$ is in two steps. The first step (Lemma 2) shows that when the last
correct processor to start its $i$th virtual clock does so, the $i$th virtual clocks at all correct processors
will be close together; the second step (Lemma 3) extends this, showing that this implies that correct
virtual clocks will remain close together.

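Lemma 2: If (i) $\beta \le r_{\min}$

(ii) $\mu \le \delta_S$

(iii) $\pi(\delta_S + 2\rho\beta,\ 2(\beta(1+\rho)+\Lambda)) \le \delta_S$

then $(\forall i:\ 0 < i:\ t_{\max}^{i} = \max(t_p^{i}, t_q^{i}) \Rightarrow |\bar c_q^{\,i}(t_{\max}^{i}) - \bar c_p^{\,i}(t_{\max}^{i})| \le \delta_S)$.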

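Proof: By induction on $i$.

Base Case: $(\forall i:\ 0 < i \le 1:\ t_{\max}^{i} = \max(t_p^{i}, t_q^{i}) \Rightarrow |\bar c_q^{\,i}(t_{\max}^{i}) - \bar c_p^{\,i}(t_{\max}^{i})| \le \delta_S)$.

$0 \le c_p(0) \le \mu$  due to Hardware Initial Value (2.1).

$0 \le \bar c_p^{\,1}(0) \le \mu$  because $adj_p^{1} = 0$ (see Figure 2.1).

$0 \le \bar c_q^{\,1}(0) \le \mu$  same argument for processor $q$.

$0 \le \bar c_p^{\,1}(t_{\max}^{1}) \le \mu$ and $0 \le \bar c_q^{\,1}(t_{\max}^{1}) \le \mu$  since $t_p^{1} = t_q^{1} = 0 = \max(t_p^{1}, t_q^{1}) = t_{\max}^{1}$.

$|\bar c_q^{\,1}(t_{\max}^{1}) - \bar c_p^{\,1}(t_{\max}^{1})| \le \mu$  substituting with previous line.

$|\bar c_q^{\,1}(t_{\max}^{1}) - \bar c_p^{\,1}(t_{\max}^{1})| \le \delta_S$  due to hypothesis (ii).

$(\forall i:\ 0 < i \le 1:\ t_{\max}^{i} = \max(t_p^{i}, t_q^{i}) \Rightarrow |\bar c_q^{\,i}(t_{\max}^{i}) - \bar c_p^{\,i}(t_{\max}^{i})| \le \delta_S)$.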

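Induction Case. As an Induction Hypothesis assume:

$$(\forall I:\ 0 < I \le i:\ t_{\max}^{I} = \max(t_p^{I}, t_q^{I}) \Rightarrow |\bar c_q^{\,I}(t_{\max}^{I}) - \bar c_p^{\,I}(t_{\max}^{I})| \le \delta_S). \tag{A1}$$

According to the protocol of Figure 2.1, the definition of $\bar c_p^{\,i+1}$, and the fact that reading the clock
at another processor has an associated error, we have:

$$\bar c_p^{\,i+1}(t_p^{i+1}) = CF(p,\ \bar c_1^{\,i}(t_p^{i+1}) + \lambda_p(1),\ \ldots,\ \bar c_N^{\,i}(t_p^{i+1}) + \lambda_p(N)) \tag{A2}$$

$$\bar c_q^{\,i+1}(t_q^{i+1}) = CF(q,\ \bar c_1^{\,i}(t_q^{i+1}) + \lambda_q(1),\ \ldots,\ \bar c_N^{\,i}(t_q^{i+1}) + \lambda_q(N)) \tag{A3}$$

The arguments to $CF$ in both (A2) and (A3) are defined (and therefore can be computed) due to
Lemma 1 and hypothesis (i). Without loss of generality, assume $t_q^{i+1} \le t_p^{i+1}$. For correct processors
$p$ and $q$, we conclude

$$\bar c_q^{\,i+1}(t_p^{i+1}) \le \bar c_q^{\,i+1}(t_q^{i+1}) + (t_p^{i+1} - t_q^{i+1})(1+\rho)$$

due to Hardware Rate (2.2). Using Monotonicity of $CF$, we substitute for $\bar c_q^{\,i+1}(t_q^{i+1})$ in this formula
based on (A3) and obtain

$$\bar c_q^{\,i+1}(t_p^{i+1}) \le CF(q,\ \bar c_1^{\,i}(t_q^{i+1}) + \lambda_q(1),\ \ldots,\ \bar c_N^{\,i}(t_q^{i+1}) + \lambda_q(N)) + (t_p^{i+1} - t_q^{i+1})(1+\rho). \tag{A4}$$

Since $t_p^{i+1} - t_q^{i+1} \le \beta$ due to RTS1 (§2), (A4) can be simplified to

$$\bar c_q^{\,i+1}(t_p^{i+1}) \le CF(q,\ \bar c_1^{\,i}(t_q^{i+1}) + \lambda_q(1),\ \ldots,\ \bar c_N^{\,i}(t_q^{i+1}) + \lambda_q(N)) + \beta(1+\rho).$$

Translation Invariance allows the $\beta(1+\rho)$ term to be moved inside $CF$, resulting in

$$\bar c_q^{\,i+1}(t_p^{i+1}) \le CF(q,\ \bar c_1^{\,i}(t_q^{i+1}) + \beta(1+\rho) + \lambda_q(1),\ \ldots,\ \bar c_N^{\,i}(t_q^{i+1}) + \beta(1+\rho) + \lambda_q(N)). \tag{A5}$$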

We can now use the Precision Enhancement Property for $CF$ to show that
$t_{\max}^{i+1} = \max(t_p^{i+1}, t_q^{i+1}) \Rightarrow |\bar c_q^{\,i+1}(t_{\max}^{i+1}) - \bar c_p^{\,i+1}(t_{\max}^{i+1})| \le \delta_S$, as required to establish the Induction Case.
By assumption, $t_q^{i+1} \le t_p^{i+1}$, so it suffices to prove $|\bar c_q^{\,i+1}(t_p^{i+1}) - \bar c_p^{\,i+1}(t_p^{i+1})| \le \delta_S$ to establish the
Induction Case. To do so, we first determine constants $\epsilon$ and $\delta$ for the Precision Enhancement
Property.

To characterize $\epsilon$, note that due to Hardware Rate (2.2) and the fact that $t_p^{i+1} - t_q^{i+1} \le \beta$, for
each correct processor $a$,

$$\beta(1-\rho) \le (t_p^{i+1} - t_q^{i+1})(1-\rho) \le \bar c_a^{\,i}(t_p^{i+1}) - \bar c_a^{\,i}(t_q^{i+1}) \le (t_p^{i+1} - t_q^{i+1})(1+\rho) \le \beta(1+\rho).$$

Also, from the definition of $\Lambda$,

$$(\forall b:\ |\lambda_q(b)| \le \Lambda).$$

Therefore, the difference between the value in equation (A5) of the $a$th argument to $CF$ and in
equation (A2) for any correct processor $a$ can be at most $\epsilon = 2(\beta(1+\rho)+\Lambda)$.

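To characterize $\delta$ of the Precision Enhancement Property, note that by Induction
Hypothesis (A1) we have for correct processors $a$ and $b$

$$t_{\max}^{i} = \max(t_a^{i}, t_b^{i}) \Rightarrow |\bar c_a^{\,i}(t_{\max}^{i}) - \bar c_b^{\,i}(t_{\max}^{i})| \le \delta_S. \tag{A6}$$

Without loss of generality, assume $\bar c_a^{\,i}(t_{\max}^{i}) \le \bar c_b^{\,i}(t_{\max}^{i})$. Thus for the $a$th and $b$th arguments to $CF$ in
(A2):

$\delta = |\bar c_b^{\,i}(t_p^{i+1}) - \bar c_a^{\,i}(t_p^{i+1})|$

$= \bar c_b^{\,i}(t_p^{i+1}) - \bar c_a^{\,i}(t_p^{i+1})$

$\le \bar c_b^{\,i}(t_{\max}^{i}) + (t_p^{i+1} - t_{\max}^{i})(1+\rho) - (\bar c_a^{\,i}(t_{\max}^{i}) + (t_p^{i+1} - t_{\max}^{i})(1-\rho))$  due to Hardware Rate (2.2) since $t_{\max}^{i} \le t_p^{i+1}$ by Lemma 1.

$\le \bar c_b^{\,i}(t_{\max}^{i}) - \bar c_a^{\,i}(t_{\max}^{i}) + 2\rho\beta$  algebra and the definition of $\beta$.

$\le \delta_S + 2\rho\beta$  due to (A6).

Using (A5) to characterize $\bar c_q^{\,i+1}(t_p^{i+1})$ and (A2) to characterize $\bar c_p^{\,i+1}(t_p^{i+1})$, we get

$|\bar c_q^{\,i+1}(t_p^{i+1}) - \bar c_p^{\,i+1}(t_p^{i+1})| \le CF(q,\ \bar c_1^{\,i}(t_q^{i+1}) + \beta(1+\rho) + \lambda_q(1),\ \ldots,\ \bar c_N^{\,i}(t_q^{i+1}) + \beta(1+\rho) + \lambda_q(N)) - CF(p,\ \bar c_1^{\,i}(t_p^{i+1}) + \lambda_p(1),\ \ldots,\ \bar c_N^{\,i}(t_p^{i+1}) + \lambda_p(N))$

$\le \pi(\delta_S + 2\rho\beta,\ 2(\beta(1+\rho)+\Lambda))$  by Precision Enhancement Property

$\le \delta_S$  by Hypothesis (iii).

This completes the Induction Case. □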

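Lemma 3: If (i) $\beta \le r_{\min}$

(ii) $\mu \le \delta_S$

(iii) $\pi(\delta_S + 2\rho\beta,\ 2(\beta(1+\rho)+\Lambda)) \le \delta_S$

(iv) $\delta_S + 2\rho(r_{\max}+\beta) \le \delta$

(v) $\alpha(\delta_S + 2\rho(r_{\max}+\beta)) + 2\rho\beta \le \delta$

then $(\forall t:\ 0 \le t:\ |\bar c_q(t) - \bar c_p(t)| \le \delta)$.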

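Proof: The conclusion of the lemma is equivalent to

$$(\forall i:\ 0 < i:\ (\forall t:\ \max(t_p^{i}, t_q^{i}) \le t < \min(t_p^{i+1}, t_q^{i+1}):\ |\bar c_q^{\,i}(t) - \bar c_p^{\,i}(t)| \le \delta))\ \wedge \tag{A7}$$

$$(\forall i:\ 0 < i:\ (\forall t:\ \min(t_p^{i+1}, t_q^{i+1}) \le t < \max(t_p^{i+1}, t_q^{i+1}):\ (t_q^{i+1} \le t_p^{i+1} \Rightarrow |\bar c_q^{\,i+1}(t) - \bar c_p^{\,i}(t)| \le \delta)\ \wedge\ (t_p^{i+1} \le t_q^{i+1} \Rightarrow |\bar c_q^{\,i}(t) - \bar c_p^{\,i+1}(t)| \le \delta))) \tag{A8}$$

We first prove (A7). Due to hypotheses (i) - (iii) we can use Lemma 2 to conclude:

$$(\forall i:\ 0 < i:\ t_{\max}^{i} = \max(t_p^{i}, t_q^{i}) \Rightarrow |\bar c_q^{\,i}(t_{\max}^{i}) - \bar c_p^{\,i}(t_{\max}^{i})| \le \delta_S).$$

According to Hardware Rate (2.2), correct clocks drift apart no more than $2\rho$ clock seconds/real
second, and therefore

$$(\forall i:\ 0 < i:\ (\forall t:\ \max(t_p^{i}, t_q^{i}) \le t:\ |\bar c_q^{\,i}(t) - \bar c_p^{\,i}(t)| \le \delta_S + 2\rho(t - \max(t_p^{i}, t_q^{i})))).$$

This implies

$$(\forall i:\ 0 < i:\ (\forall t:\ \max(t_p^{i}, t_q^{i}) \le t \le \min(t_p^{i+1}, t_q^{i+1}):\ |\bar c_q^{\,i}(t) - \bar c_p^{\,i}(t)| \le \delta_S + 2\rho(r_{\max}+\beta))) \tag{A9}$$

because $r_{\min} - \beta \le \min(t_p^{i+1}, t_q^{i+1}) - \max(t_p^{i}, t_q^{i}) \le r_{\max} + \beta$ due to RTS1 (§2). Using Hypothesis (iv),
(A9) can be simplified to

$$(\forall i:\ 0 < i:\ (\forall t:\ \max(t_p^{i}, t_q^{i}) \le t \le \min(t_p^{i+1}, t_q^{i+1}):\ |\bar c_q^{\,i}(t) - \bar c_p^{\,i}(t)| \le \delta)),$$

which implies (A7) as desired.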

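To prove (A8), without loss of generality we assume that $t_q^{i+1} \le t_p^{i+1}$. Thus, (A8) is
equivalent to

$$(\forall i:\ 0 < i:\ (\forall t:\ \min(t_p^{i+1}, t_q^{i+1}) \le t < \max(t_p^{i+1}, t_q^{i+1}):\ |\bar c_q^{\,i+1}(t) - \bar c_p^{\,i}(t)| \le \delta)) \tag{A10}$$

and it suffices to prove that. To do this, we infer from (A9)

$$(\forall i:\ 0 < i:\ (\forall t:\ t = \min(t_p^{i+1}, t_q^{i+1}):\ |\bar c_q^{\,i}(t) - \bar c_p^{\,i}(t)| \le \delta_S + 2\rho(r_{\max}+\beta))). \tag{A11}$$

Therefore, we can take $\delta$ in the definition of accuracy $\alpha$ to be $\delta = \delta_S + 2\rho(r_{\max}+\beta)$ and, using the
Accuracy Preservation Property, obtain a bound for how $\bar c_q^{\,i+1}(t_q^{i+1})$ differs from any argument to
$CF$ used in calculating $\bar c_q^{\,i+1}(t_q^{i+1})$. Since $\bar c_p^{\,i}(t_q^{i+1})$ must have been such an argument:

$$(\forall i:\ 0 < i:\ (\forall t:\ t = \min(t_p^{i+1}, t_q^{i+1}):\ |\bar c_q^{\,i+1}(t) - \bar c_p^{\,i}(t)| \le \alpha(\delta_S + 2\rho(r_{\max}+\beta)))).$$

This implies that

$$(\forall i:\ 0 < i:\ (\forall t:\ \min(t_p^{i+1}, t_q^{i+1}) \le t:\ |\bar c_q^{\,i+1}(t) - \bar c_p^{\,i}(t)| \le \alpha(\delta_S + 2\rho(r_{\max}+\beta)) + 2\rho(t - \min(t_p^{i+1}, t_q^{i+1})))) \tag{A12}$$

due to Hardware Rate (2.2). From the definition of $\beta$ in RTS1,
$0 \le \max(t_p^{i+1}, t_q^{i+1}) - \min(t_p^{i+1}, t_q^{i+1}) \le \beta$, so equation (A12) implies

$$(\forall i:\ 0 < i:\ (\forall t:\ \min(t_p^{i+1}, t_q^{i+1}) \le t < \max(t_p^{i+1}, t_q^{i+1}):\ |\bar c_q^{\,i+1}(t) - \bar c_p^{\,i}(t)| \le \alpha(\delta_S + 2\rho(r_{\max}+\beta)) + 2\rho\beta)).$$

Substituting for $\alpha(\delta_S + 2\rho(r_{\max}+\beta)) + 2\rho\beta$ according to Hypothesis (v) yields

$$(\forall i:\ 0 < i:\ (\forall t:\ \min(t_p^{i+1}, t_q^{i+1}) \le t < \max(t_p^{i+1}, t_q^{i+1}):\ |\bar c_q^{\,i+1}(t) - \bar c_p^{\,i}(t)| \le \delta))$$

as was required (i.e. (A10)) in order to prove (A8). □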
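The five hypotheses of Lemma 3 are simple inequalities over the system parameters, so a candidate configuration can be checked mechanically. The following is a minimal sketch under stated assumptions: the function name `lemma3_hypotheses`, the sample `pi` and `alpha` functions, and all numeric values are hypothetical and chosen only for illustration; they are not the precision or accuracy of any convergence function from §4.

```python
from typing import Callable, Dict


def lemma3_hypotheses(beta: float, r_min: float, r_max: float, mu: float,
                      rho: float, Lambda: float, delta_s: float, delta: float,
                      pi: Callable[[float, float], float],
                      alpha: Callable[[float], float]) -> Dict[str, bool]:
    """Evaluate hypotheses (i)-(v) of Lemma 3 for one parameter choice."""
    drift = 2 * rho * (r_max + beta)  # worst-case drift over one resync period
    return {
        "i": beta <= r_min,
        "ii": mu <= delta_s,
        "iii": pi(delta_s + 2 * rho * beta,
                  2 * (beta * (1 + rho) + Lambda)) <= delta_s,
        "iv": delta_s + drift <= delta,
        "v": alpha(delta_s + drift) + 2 * rho * beta <= delta,
    }


if __name__ == "__main__":
    # Hypothetical precision and accuracy functions (illustration only).
    sample_pi = lambda d, e: d / 2 + e
    sample_alpha = lambda d: d + 0.001
    result = lemma3_hypotheses(beta=0.001, r_min=1.0, r_max=2.0, mu=0.001,
                               rho=1e-5, Lambda=0.001, delta_s=0.01,
                               delta=0.02, pi=sample_pi, alpha=sample_alpha)
    print(result)
```

If every entry is true, the lemma's conclusion applies with the chosen $\delta$; a false entry indicates which parameter must be relaxed.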


The previous lemma established that virtual clocks using instantaneous resynchronization
satisfy Virtual Synchronization (2.3). We now prove that virtual clocks using continuous
resynchronization also satisfy Virtual Synchronization (2.3). Define $ai$ to be the maximum number of real
seconds it takes for adjustment interval $AI$ to elapse at any correct processor. Thus, $ai = AI/(1-\rho)$.
Further, define a fixed clock $\tilde c_p^{\,i}$ to be a function from real time to clock time satisfying$^{13}$

FC1: $(\forall t:\ t_p^{i+1} + ai < t:\ \tilde c_p^{\,i+1}(t) = \bar c_p^{\,i+1}(t))$ and

FC2: $(\forall t:\ t_p^{i+1} \le t \le t_p^{i+1} + ai:\ \tilde c_p^{\,i+1}(t) \in [\bar c_p^{\,i}(t),\ \bar c_p^{\,i+1}(t)])$.

Thus, outside of its adjustment interval, the value of $\tilde c_p^{\,i+1}(t)$ is the same as $\bar c_p^{\,i+1}(t)$; and during its
adjustment interval, the value of $\tilde c_p^{\,i+1}(t)$ is guaranteed to lie between the value of $\bar c_p^{\,i}(t)$ and $\bar c_p^{\,i+1}(t)$.

$^{13}$ We use the notation $x \in [a, b]$ to denote $\min(a,b) \le x \le \max(a,b)$.

From FC1 and FC2, we conclude that in order to prove for any given $D$
$(\forall t:\ 0 \le t:\ |\tilde c_p(t) - \tilde c_q(t)| \le D)$, we must establish

$$(\forall t:\ t_p^{i+1} \le t < t_p^{i+2}\ \wedge\ t_q^{j+1} \le t < t_q^{j+2}:\ |\bar c_p^{\,i+1}(t) - \bar c_q^{\,j+1}(t)| \le D) \text{ and} \tag{A13}$$

$$(\forall t:\ t_p^{i+1} \le t \le t_p^{i+1} + ai\ \wedge\ t_q^{j+1} \le t \le t_q^{j+1} + ai:\ |\bar c_p^{\,i}(t) - \bar c_q^{\,j}(t)| \le D\ \wedge\ |\bar c_p^{\,i+1}(t) - \bar c_q^{\,j}(t)| \le D\ \wedge\ |\bar c_p^{\,i}(t) - \bar c_q^{\,j+1}(t)| \le D). \tag{A14}$$

Since according to definition (2.5), a virtual clock $\hat c_p$ satisfies the definition of a fixed clock, by
choosing $\hat\delta \ge \alpha(\delta + 2\rho(ai))$, the following theorem proves that Virtual Synchronization (2.3) holds for
virtual clocks that use FIX to implement continuous resynchronization.

Theorem 4: If $(\forall t:\ 0 \le t:\ |\bar c_p(t) - \bar c_q(t)| \le \delta)$

then $(\forall t:\ 0 \le t:\ |\tilde c_p(t) - \tilde c_q(t)| \le \alpha(\delta + 2\rho(ai)))$.

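Proof: The result follows if we prove (A13) and (A14) for $D = \alpha(\delta + 2\rho(ai))$. Using the
definition of $\bar c_p$, we rewrite the hypothesis of the theorem as:

$$(\forall t:\ t_p^{i} \le t < t_p^{i+1}\ \wedge\ t_q^{j} \le t < t_q^{j+1}:\ |\bar c_p^{\,i}(t) - \bar c_q^{\,j}(t)| \le \delta). \tag{A15}$$

This implies (A13) if $\delta \le \alpha(\delta + 2\rho(ai))$. To see that $\delta \le \alpha(\delta + 2\rho(ai))$, first note that $\delta \le \delta + 2\rho(ai)$
since $0 \le \rho$ and $0 \le ai$. The result then follows because $\delta \le \alpha(\delta)$ due to (4.1).

All that remains is to prove (A14). Due to Hardware Rate (2.2), $\bar c_p^{\,i}$ and $\bar c_q^{\,j}$ can drift apart
by at most $2\rho$ seconds per second. Therefore, we can extend the range of (A15) as follows:

$$(\forall t:\ t_p^{i} \le t < t_p^{i+1} + ai\ \wedge\ t_q^{j} \le t < t_q^{j+1} + ai:\ |\bar c_p^{\,i}(t) - \bar c_q^{\,j}(t)| \le \delta + 2\rho(ai)). \tag{A16}$$

By definition, $t_p^{i} < t_p^{i+1} < t_p^{i+1} + ai$ and $t_q^{j} < t_q^{j+1} < t_q^{j+1} + ai$. So, from (A16) we conclude

$$(\forall t:\ t_p^{i+1} \le t < t_p^{i+1} + ai\ \wedge\ t_q^{j+1} \le t < t_q^{j+1} + ai:\ |\bar c_p^{\,i}(t) - \bar c_q^{\,j}(t)| \le \delta + 2\rho(ai)). \tag{A17}$$

From property (4.1) of $\alpha$, $\delta + 2\rho(ai) \le \alpha(\delta + 2\rho(ai))$, so

$$(\forall t:\ t_p^{i+1} \le t < t_p^{i+1} + ai\ \wedge\ t_q^{j+1} \le t < t_q^{j+1} + ai:\ |\bar c_p^{\,i}(t) - \bar c_q^{\,j}(t)| \le \alpha(\delta + 2\rho(ai))). \tag{A18}$$

According to the Accuracy Preservation Property, with its parameter $\delta$ taken to be $\delta + 2\rho(ai)$ due to (A16), and
using the same argument as was used to change the range of (A15) to get (A17), we conclude:

$$(\forall t:\ t_p^{i+1} \le t < t_p^{i+1} + ai\ \wedge\ t_q^{j+1} \le t < t_q^{j+1} + ai:\ |\bar c_p^{\,i+1}(t) - \bar c_q^{\,j}(t)| \le \alpha(\delta + 2\rho(ai))) \tag{A19}$$

$$(\forall t:\ t_p^{i+1} \le t < t_p^{i+1} + ai\ \wedge\ t_q^{j+1} \le t < t_q^{j+1} + ai:\ |\bar c_p^{\,i}(t) - \bar c_q^{\,j+1}(t)| \le \alpha(\delta + 2\rho(ai))) \tag{A20}$$

We can now combine (A18), (A19), and (A20), obtaining (A14) with $\alpha(\delta + 2\rho(ai)) = D$. This,
then, completes the proof. □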
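The bound of Theorem 4 is easy to evaluate once an accuracy function is fixed. The sketch below assumes a caller-supplied `alpha`; the function name `continuous_resync_bound` and the sample numbers are hypothetical, and the identity function is used for `alpha` only to keep the arithmetic visible.

```python
from typing import Callable


def continuous_resync_bound(delta: float, rho: float, AI: float,
                            alpha: Callable[[float], float]) -> float:
    """Return alpha(delta + 2*rho*ai), where ai = AI / (1 - rho) is the
    longest real-time duration of an adjustment interval of AI clock seconds."""
    if not 0 <= rho < 1:
        raise ValueError("rho must satisfy 0 <= rho < 1")
    ai = AI / (1 - rho)
    return alpha(delta + 2 * rho * ai)


if __name__ == "__main__":
    # Toy values: an exaggerated drift rate makes the extra term easy to see.
    bound = continuous_resync_bound(delta=0.02, rho=0.5, AI=1.0,
                                    alpha=lambda d: d)
    print(bound)
```

Because $\delta \le \alpha(\delta)$ by (4.1), the returned value is never smaller than the instantaneous-resynchronization bound $\delta$ it starts from.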


Rates  of  Virtual  Clocks 

To prove that virtual clocks satisfy Virtual Rate (2.4), we first require the following technical
lemma.

Lemma 5: If $\epsilon \ge 0$ then $(\max x, y, z:\ \min((x+\epsilon)-y,\ z) - \min(x-y,\ z)) \le \epsilon$.

Proof: First, suppose $z \le x-y$. This implies that $\min(x-y, z) = z$ and that $z \le (x+\epsilon)-y$.
Consequently, $\min((x+\epsilon)-y, z) = z$. This means that when $z \le x-y$,

$$\min((x+\epsilon)-y,\ z) - \min(x-y,\ z) = z - z = 0.$$

Next, suppose $z > x-y$. This implies that $\min(x-y, z) = x-y$. Therefore,

$\min((x+\epsilon)-y,\ z) - \min(x-y,\ z)$

$= \min((x+\epsilon)-y,\ z) - (x-y)$

$= \min((x+\epsilon)-y-(x-y),\ z-(x-y))$

$= \min(\epsilon,\ z-x+y)$

$\le \epsilon$

Since when $v \le \epsilon$ then $\max(0, v) \le \epsilon$, the lemma follows. □
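Lemma 5 can also be sanity-checked numerically. The sketch below is not a substitute for the proof; it samples random values and confirms that the difference of the two minima never exceeds $\epsilon$. The function names `lemma5_gap` and `check_lemma5`, the sampling ranges, and the floating-point tolerance are assumptions of this example.

```python
import random


def lemma5_gap(x: float, y: float, z: float, eps: float) -> float:
    """The quantity bounded by Lemma 5: min((x+eps)-y, z) - min(x-y, z)."""
    return min((x + eps) - y, z) - min(x - y, z)


def check_lemma5(trials: int = 10000, seed: int = 0) -> bool:
    """Return True if no sampled (x, y, z, eps) with eps >= 0 violates the bound."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(-100.0, 100.0)
        y = rng.uniform(-100.0, 100.0)
        z = rng.uniform(-100.0, 100.0)
        eps = rng.uniform(0.0, 50.0)
        # Small tolerance absorbs floating-point rounding in the subtraction.
        if lemma5_gap(x, y, z, eps) > eps + 1e-9:
            return False
    return True


if __name__ == "__main__":
    print(lemma5_gap(5, 1, 10, 2))  # both minima are below z: gap equals eps
    print(lemma5_gap(5, 1, 3, 2))   # z caps both minima: gap is 0
    print(check_lemma5())
```

The two hand-picked calls correspond to the two cases of the proof: $z > x-y$ with the cap not reached, and $z \le x-y$.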


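We are now able to prove that virtual clocks have rates that satisfy Virtual Rate (2.4).

Theorem 6: If (i) $0 < \kappa \le \hat\kappa$

(ii) $|\hat c_q(t) - \hat c_p(t)| \le \delta$

(iii) $(1+\rho)\left[1 + \dfrac{\alpha(\delta)}{AI}\right] \le (1+\hat\rho)$

(iv) $(1-\hat\rho) \le (1-\rho)\left[1 - \dfrac{\alpha(\delta)}{AI}\right]$

then $0 < 1-\hat\rho \le \dfrac{\hat c_p(t+\hat\kappa) - \hat c_p(t)}{\hat\kappa} \le 1+\hat\rho$ for $0 \le t$.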


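Proof: Let $i$ satisfy $t_p^{i} \le t < t_p^{i+1}$. Consequently, $\hat c_p(t) = \hat c_p^{\,i}(t)$. Observe that $t+\hat\kappa \le t_p^{i+1}$ because
otherwise we would have

$$t_p^{i} \le t < t_p^{i+1} < t+\hat\kappa,$$

which would imply that $p$ started $\hat c_p^{\,i+1}$ between two virtual clock ticks, contrary to the protocol of
§2.

We are interested in bounding

$$\frac{\hat c_p^{\,i}(t+\hat\kappa) - \hat c_p^{\,i}(t)}{\hat\kappa} = \frac{[c_p(t+\hat\kappa) + FIX_p^{i}(c_p(t+\hat\kappa))] - [c_p(t) + FIX_p^{i}(c_p(t))]}{\hat\kappa} = \frac{c_p(t+\hat\kappa) - c_p(t) + FIX_p^{i}(c_p(t+\hat\kappa)) - FIX_p^{i}(c_p(t))}{\hat\kappa}. \tag{A21}$$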


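First, we derive an upper bound for (A21). For a correct processor $p$, we have

$$c_p(t+\hat\kappa) - c_p(t) \le (1+\rho)\hat\kappa \tag{A22}$$

from Hardware Rate (2.2). According to the definition of $FIX_p^{i}$ (in §2), we have

$$FIX_p^{i}(c_p(t+\hat\kappa)) = adj_p^{i-1} + \frac{(adj_p^{i} - adj_p^{i-1}) \times \min(c_p(t+\hat\kappa) - c_p(t_p^{i}),\ AI)}{AI}$$

$$FIX_p^{i}(c_p(t)) = adj_p^{i-1} + \frac{(adj_p^{i} - adj_p^{i-1}) \times \min(c_p(t) - c_p(t_p^{i}),\ AI)}{AI}$$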

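Therefore,

$$FIX_p^{i}(c_p(t+\hat\kappa)) - FIX_p^{i}(c_p(t)) = \frac{(adj_p^{i} - adj_p^{i-1})\,(\min(c_p(t+\hat\kappa) - c_p(t_p^{i}),\ AI) - \min(c_p(t) - c_p(t_p^{i}),\ AI))}{AI}. \tag{A23}$$

Letting

$x+\epsilon = c_p(t+\hat\kappa)$

$x = c_p(t)$

$y = c_p(t_p^{i})$

$z = AI$

due to Hypothesis (i) and the fact that hardware clocks are non-decreasing, we have $\epsilon \ge 0$. Thus,
we can apply Lemma 5 to infer from (A23):

$$FIX_p^{i}(c_p(t+\hat\kappa)) - FIX_p^{i}(c_p(t)) \le \frac{(adj_p^{i} - adj_p^{i-1})(\epsilon)}{AI} = \frac{(adj_p^{i} - adj_p^{i-1})(c_p(t+\hat\kappa) - c_p(t))}{AI}. \tag{A24}$$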

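Substituting into (A21), using (A22) for $c_p(t+\hat\kappa) - c_p(t)$ and (A24) for
$FIX_p^{i}(c_p(t+\hat\kappa)) - FIX_p^{i}(c_p(t))$, we get

$$\frac{\hat c_p^{\,i}(t+\hat\kappa) - \hat c_p^{\,i}(t)}{\hat\kappa} \le (1+\rho)\left[1 + \frac{adj_p^{i} - adj_p^{i-1}}{AI}\right].$$

According to (4.2),

$$-\alpha(\delta) \le adj_p^{i} - adj_p^{i-1} \le \alpha(\delta) \tag{A25}$$

since Hypothesis (ii) stipulates that virtual clocks are synchronized to within $\delta$. Therefore,

$$\frac{\hat c_p^{\,i}(t+\hat\kappa) - \hat c_p^{\,i}(t)}{\hat\kappa} \le (1+\rho)\left[1 + \frac{\alpha(\delta)}{AI}\right].$$

Thus, using Hypothesis (iii) and transitivity, we get

$$\frac{\hat c_p^{\,i}(t+\hat\kappa) - \hat c_p^{\,i}(t)}{\hat\kappa} \le (1+\hat\rho)$$

as desired.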

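Next, we derive a lower bound for (A21). According to Hardware Rate (2.2), we conclude

$$(1-\rho)\hat\kappa \le c_p(t+\hat\kappa) - c_p(t). \tag{A26}$$

Using the same argument as above with $-\alpha(\delta)$ as the lower bound for the value of $adj_p^{i} - adj_p^{i-1}$
(due to (A25)), we get

$$(1-\rho)\left[1 - \frac{\alpha(\delta)}{AI}\right] \le \frac{\hat c_p^{\,i}(t+\hat\kappa) - \hat c_p^{\,i}(t)}{\hat\kappa}.$$

Thus, using Hypothesis (iv) we get

$$(1-\hat\rho) \le \frac{\hat c_p^{\,i}(t+\hat\kappa) - \hat c_p^{\,i}(t)}{\hat\kappa}$$

as desired.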
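The role of the correction factor in this proof can be seen in a short sketch of continuous resynchronization. It follows the form of $FIX_p^{i}$ used in the proof above: the adjustment moves linearly from $adj_p^{i-1}$ to $adj_p^{i}$ while the hardware clock advances by $AI$. The function names `fix`, `virtual_clock`, and `min_rho_hat`, and the sample values, are hypothetical; `min_rho_hat` simply solves hypotheses (iii) and (iv) of Theorem 6 for the smallest admissible virtual drift bound.

```python
def fix(c_now: float, c_start: float, adj_prev: float, adj_cur: float,
        AI: float) -> float:
    """Amortized correction: adj_prev at the resynchronization point c_start,
    adj_cur once the hardware clock has advanced by AI, linear in between."""
    elapsed = min(c_now - c_start, AI)
    return adj_prev + (adj_cur - adj_prev) * elapsed / AI


def virtual_clock(c_now: float, c_start: float, adj_prev: float,
                  adj_cur: float, AI: float) -> float:
    """Virtual clock value: hardware clock plus the amortized correction."""
    return c_now + fix(c_now, c_start, adj_prev, adj_cur, AI)


def min_rho_hat(rho: float, alpha_delta: float, AI: float) -> float:
    """Smallest rho_hat satisfying hypotheses (iii) and (iv) of Theorem 6."""
    a = alpha_delta / AI
    from_iii = (1 + rho) * (1 + a) - 1   # (1+rho)(1+a) <= 1+rho_hat
    from_iv = 1 - (1 - rho) * (1 - a)    # 1-rho_hat <= (1-rho)(1-a)
    return max(from_iii, from_iv)


if __name__ == "__main__":
    # Spread an adjustment of 0.5 clock seconds over AI = 10 clock seconds.
    for c in (100, 105, 110, 200):
        print(c, fix(c, c_start=100, adj_prev=0.0, adj_cur=0.5, AI=10))
    print(min_rho_hat(rho=1e-5, alpha_delta=0.5, AI=10))
```

During the adjustment interval the virtual clock in this example gains an extra 0.05 clock seconds per hardware clock second, which is the $(adj_p^{i} - adj_p^{i-1})/AI$ term that hypotheses (iii) and (iv) must absorb.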
Appendix 2: Glossary of Notation

The section in which each term is defined appears in parentheses at the end of the entry for that term.

$ai$  Maximum number of real seconds in adjustment interval over which resynchronization is
spread (Appendix 1).

$AI$  Number of clock seconds in adjustment interval over which resynchronization is spread (§2).

$adj_p^{i}$  $\hat c_p^{\,i}(t) = c_p(t) + adj_p^{i}$ except during the adjustment interval (§2).

$c_p$  Hardware clock at processor $p$ (§2).

$\hat c_p$  Virtual clock at processor $p$ (§2).

$\hat c_p^{\,i}$  $i$th virtual clock at processor $p$ (§2).

$\bar c_p$  Virtual clock at processor $p$ using instantaneous resynchronization (Appendix 1).

$\tilde c_p$  Fixed clock at processor $p$ (Appendix 1).

$FIX_p^{i}$  Correction factor to spread $adj_p^{i-1} - adj_p^{i}$ over $AI$ and transform $c_p$ into $\hat c_p^{\,i}$ (§2).

$r_{\min}$  Minimum real time between successive events produced by the reliable time source (§2).

$r_{\max}$  Maximum real time between successive events produced by the reliable time source (§2).

$R$  Resynchronization interval in clock seconds (§2).

$t_p^{i}$  Real time $p$ starts $\hat c_p^{\,i}$ (§2).

$t_{RTS}^{i}$  Real time the reliable time source generates the $i$th event (§2).

$V_p^{i}$  Value provided by reliable time source to $p$ for starting $\hat c_p^{\,i}$ (§2).

$\alpha_X(\delta)$  Accuracy of convergence function $CF_X$ (§4).

$\beta$  Maximum real time delay between generation of an event by the reliable time source and
detection of that event by a correct processor (§2).

$\Gamma_{\max}$  Maximum delay according to the clock at any correct processor to send a message from one
processor to another (§3).

$\Gamma_{\min}$  Minimum delay according to the clock at any correct processor to send a message from one
processor to another (§3).

$\delta$  Difference between any two correct virtual clocks (§2).

$\delta_S$  Bound, at the instant both have been started, on the difference between any two identically
superscripted correct virtual clocks that use instantaneous resynchronization (§7).

$\kappa$  Real time tick width for $c_p$ (§2).

$\hat\kappa$  Real time width of a tick by $\hat c_p$ (§2).

k‘p(q)  Bound  on  error  in  p's  approximation  of  (§3). 

$\Lambda$  Maximum clock reading error for any pair of processors (§3).

$\mu$  Upper bound on $c_p(0)$ (§2).

$\pi_X(\delta, \epsilon)$  Precision of convergence function $CF_X$ (§4).

$\rho$  Upper bound on $c_p$ drift rate (§2).

$\hat\rho$  Upper bound on $\hat c_p$ drift rate (§2).

Approximation  of  c^(rj,+1  )-cp(t)  (§3). 
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